{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "fbc94ec4-be4e-4947-83ba-92fb7bfda8d7",
   "metadata": {},
   "source": [
    "# Search Best Architecture and Hyperparameter\n",
    "\n",
    "Sometimes (or often) we do not know exactly which architecture is the best for our data. In artificial intelligence, it is common for an architecture to be the best for one dataset and not so good for another dataset. To try to help to find the best solution, this Notebook will use two main function in PyTorch Tabular. One of them is Sweep to run all architecture available in PyTorch Tabular with default hyperparameters to search for the possible best architecture for our data. Afterward, we will use Tuner to search for the best hyperparameter of the best architecture that we found in Sweep."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "35da73dc-0ef3-43e6-b4d9-f9850e413e19",
   "metadata": {},
   "outputs": [],
   "source": [
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "\n",
    "from sklearn.model_selection import train_test_split\n",
    "\n",
    "from pytorch_tabular.utils import make_mixed_dataset\n",
    "from pytorch_tabular.config import DataConfig, OptimizerConfig, TrainerConfig"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f429d88e-47aa-4ffd-9b38-705efee840b3",
   "metadata": {},
   "source": [
    "## Data\n",
    "First of all, let's create a synthetic data which is a mix of numerical and categorical features and have multiple targets for classification. It means that there are multiple columns which we need to predict with the same set of features."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "656a1c27-6c5b-4813-9429-00d94a3e0ef1",
   "metadata": {},
   "outputs": [],
   "source": [
    "data, cat_col_names, num_col_names = make_mixed_dataset(\n",
    "    task=\"classification\", n_samples=3000, n_features=7, n_categories=4\n",
    ")\n",
    "\n",
    "train, test = train_test_split(data, random_state=42)\n",
    "train, valid = train_test_split(train, random_state=42)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2fdf4392-b2bd-4c09-a51c-99e521983b0c",
   "metadata": {},
   "source": [
    "## Common Configs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "6526eb73-ecf7-4a06-b905-bb6326fefd19",
   "metadata": {},
   "outputs": [],
   "source": [
    "data_config = DataConfig(\n",
    "    target=[\n",
    "        \"target\"\n",
    "    ],\n",
    "    continuous_cols=num_col_names,\n",
    "    categorical_cols=cat_col_names,\n",
    ")\n",
    "trainer_config = TrainerConfig(\n",
    "    batch_size=32,\n",
    "    max_epochs=50,\n",
    "    early_stopping=\"valid_accuracy\",\n",
    "    early_stopping_mode=\"max\",\n",
    "    early_stopping_patience=3,\n",
    "    checkpoints=\"valid_accuracy\",\n",
    "    load_best=True,\n",
    "    progress_bar=\"none\"\n",
    ")\n",
    "optimizer_config = OptimizerConfig()"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a0307898-05bb-453f-8890-522c20555157",
   "metadata": {},
   "source": [
    "## Model Sweep\n",
    "https://pytorch-tabular.readthedocs.io/en/latest/apidocs_coreclasses/#pytorch_tabular.model_sweep\n",
    "\n",
    "Let's train all available models (\"high_memory\"). If some of them return as \"OOM\" it means that you do not have enough memory to run in the current batch_size. You can ignore that model or reduce the batch_size in TrainerConfig."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "19dd4505-bb43-413e-b569-901c60ee319b",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pytorch_tabular import model_sweep"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "38f8ebb6-9063-4af5-a556-193942fb6c8a",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "7c83b0250f654ebb8fb0cf8bb41ae743",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">07</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">20</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">12:47:01</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">862</span> - <span style=\"font-weight: bold\">{</span>pytorch_tabular.models.node.node_model:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">73</span><span style=\"font-weight: bold\">}</span> - INFO - Data Aware Initialization of NODE   \n",
       "using a forward pass with <span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2000</span> batch size<span style=\"color: #808000; text-decoration-color: #808000\">...</span>.                                                                      \n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;36m2024\u001b[0m-\u001b[1;36m07\u001b[0m-\u001b[1;36m20\u001b[0m \u001b[1;92m12:47:01\u001b[0m,\u001b[1;36m862\u001b[0m - \u001b[1m{\u001b[0mpytorch_tabular.models.node.node_model:\u001b[1;36m73\u001b[0m\u001b[1m}\u001b[0m - INFO - Data Aware Initialization of NODE   \n",
       "using a forward pass with \u001b[1;36m2000\u001b[0m batch size\u001b[33m...\u001b[0m.                                                                      \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "sweep_df, best_model = model_sweep(\n",
    "                            task=\"classification\",\n",
    "                            train=train,\n",
    "                            test=valid,\n",
    "                            data_config=data_config,\n",
    "                            optimizer_config=optimizer_config,\n",
    "                            trainer_config=trainer_config,\n",
    "                            model_list=\"high_memory\",\n",
    "                            verbose=False # Make True if you want to log metrics and params each trial\n",
    "                        )"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "39758772-f5e3-49da-bf5d-f5a1fffb8ded",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
       "┃<span style=\"font-weight: bold\">        Test metric        </span>┃<span style=\"font-weight: bold\">       DataLoader 0        </span>┃\n",
       "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
       "│<span style=\"color: #008080; text-decoration-color: #008080\">       test_accuracy       </span>│<span style=\"color: #800080; text-decoration-color: #800080\">    0.8053333163261414     </span>│\n",
       "│<span style=\"color: #008080; text-decoration-color: #008080\">         test_loss         </span>│<span style=\"color: #800080; text-decoration-color: #800080\">    0.44678735733032227    </span>│\n",
       "└───────────────────────────┴───────────────────────────┘\n",
       "</pre>\n"
      ],
      "text/plain": [
       "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
       "┃\u001b[1m \u001b[0m\u001b[1m       Test metric       \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m      DataLoader 0       \u001b[0m\u001b[1m \u001b[0m┃\n",
       "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
       "│\u001b[36m \u001b[0m\u001b[36m      test_accuracy      \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   0.8053333163261414    \u001b[0m\u001b[35m \u001b[0m│\n",
       "│\u001b[36m \u001b[0m\u001b[36m        test_loss        \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   0.44678735733032227   \u001b[0m\u001b[35m \u001b[0m│\n",
       "└───────────────────────────┴───────────────────────────┘\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "[{'test_loss': 0.44678735733032227, 'test_accuracy': 0.8053333163261414}]"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "best_model.evaluate(test)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "f4c3c158-0ffc-4805-bbb2-fa422172d3c9",
   "metadata": {},
   "source": [
    "In the following table, we can see the best models (with default hyperparameters) for our dataset. But we are not satisfied, so in this case we will take the top two models and use Tuner to find better hyperparameters and have a better result.\n",
    "\n",
    "**PS: Each time that run the Notebook the result may change a little, so you might see different top model that we will use in the next section.**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "816f117c-e7be-4814-9928-010306cced88",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_653b4_row0_col2, #T_653b4_row0_col3, #T_653b4_row0_col4 {\n",
       "  background-color: #006837;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row1_col2 {\n",
       "  background-color: #219c52;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row1_col3 {\n",
       "  background-color: #1b9950;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row1_col4 {\n",
       "  background-color: #199750;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row2_col2, #T_653b4_row4_col3 {\n",
       "  background-color: #dff293;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row2_col3 {\n",
       "  background-color: #c3e67d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row2_col4 {\n",
       "  background-color: #0b7d42;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row3_col2 {\n",
       "  background-color: #c1e57b;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row3_col3 {\n",
       "  background-color: #d3ec87;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row3_col4 {\n",
       "  background-color: #148e4b;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row4_col2 {\n",
       "  background-color: #fafdb8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row4_col4 {\n",
       "  background-color: #0c7f43;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row5_col2 {\n",
       "  background-color: #e34933;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row5_col3 {\n",
       "  background-color: #fff6b0;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row5_col4, #T_653b4_row8_col2, #T_653b4_row8_col3 {\n",
       "  background-color: #a50026;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row6_col2 {\n",
       "  background-color: #c41e27;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row6_col3 {\n",
       "  background-color: #fee491;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row6_col4 {\n",
       "  background-color: #feeda1;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row7_col2 {\n",
       "  background-color: #de402e;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_653b4_row7_col3 {\n",
       "  background-color: #fdc372;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row7_col4 {\n",
       "  background-color: #cfeb85;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_653b4_row8_col4 {\n",
       "  background-color: #15904c;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_653b4\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_653b4_level0_col0\" class=\"col_heading level0 col0\" >model</th>\n",
       "      <th id=\"T_653b4_level0_col1\" class=\"col_heading level0 col1\" ># Params</th>\n",
       "      <th id=\"T_653b4_level0_col2\" class=\"col_heading level0 col2\" >test_loss</th>\n",
       "      <th id=\"T_653b4_level0_col3\" class=\"col_heading level0 col3\" >test_accuracy</th>\n",
       "      <th id=\"T_653b4_level0_col4\" class=\"col_heading level0 col4\" >time_taken_per_epoch</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row0\" class=\"row_heading level0 row0\" >1</th>\n",
       "      <td id=\"T_653b4_row0_col0\" class=\"data row0 col0\" >CategoryEmbeddingModel</td>\n",
       "      <td id=\"T_653b4_row0_col1\" class=\"data row0 col1\" >12 T</td>\n",
       "      <td id=\"T_653b4_row0_col2\" class=\"data row0 col2\" >0.458506</td>\n",
       "      <td id=\"T_653b4_row0_col3\" class=\"data row0 col3\" >0.797513</td>\n",
       "      <td id=\"T_653b4_row0_col4\" class=\"data row0 col4\" >0.190966</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row1\" class=\"row_heading level0 row1\" >3</th>\n",
       "      <td id=\"T_653b4_row1_col0\" class=\"data row1 col0\" >FTTransformerModel</td>\n",
       "      <td id=\"T_653b4_row1_col1\" class=\"data row1 col1\" >272 T</td>\n",
       "      <td id=\"T_653b4_row1_col2\" class=\"data row1 col2\" >0.486184</td>\n",
       "      <td id=\"T_653b4_row1_col3\" class=\"data row1 col3\" >0.770870</td>\n",
       "      <td id=\"T_653b4_row1_col4\" class=\"data row1 col4\" >0.529126</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row2\" class=\"row_heading level0 row2\" >4</th>\n",
       "      <td id=\"T_653b4_row2_col0\" class=\"data row2 col0\" >GANDALFModel</td>\n",
       "      <td id=\"T_653b4_row2_col1\" class=\"data row2 col1\" >8 T</td>\n",
       "      <td id=\"T_653b4_row2_col2\" class=\"data row2 col2\" >0.562945</td>\n",
       "      <td id=\"T_653b4_row2_col3\" class=\"data row2 col3\" >0.705151</td>\n",
       "      <td id=\"T_653b4_row2_col4\" class=\"data row2 col4\" >0.341467</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row3\" class=\"row_heading level0 row3\" >8</th>\n",
       "      <td id=\"T_653b4_row3_col0\" class=\"data row3 col0\" >TabTransformerModel</td>\n",
       "      <td id=\"T_653b4_row3_col1\" class=\"data row3 col1\" >272 T</td>\n",
       "      <td id=\"T_653b4_row3_col2\" class=\"data row3 col2\" >0.547346</td>\n",
       "      <td id=\"T_653b4_row3_col3\" class=\"data row3 col3\" >0.696270</td>\n",
       "      <td id=\"T_653b4_row3_col4\" class=\"data row3 col4\" >0.470920</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row4\" class=\"row_heading level0 row4\" >0</th>\n",
       "      <td id=\"T_653b4_row4_col0\" class=\"data row4 col0\" >AutoIntModel</td>\n",
       "      <td id=\"T_653b4_row4_col1\" class=\"data row4 col1\" >14 T</td>\n",
       "      <td id=\"T_653b4_row4_col2\" class=\"data row4 col2\" >0.580009</td>\n",
       "      <td id=\"T_653b4_row4_col3\" class=\"data row4 col3\" >0.689165</td>\n",
       "      <td id=\"T_653b4_row4_col4\" class=\"data row4 col4\" >0.360073</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row5\" class=\"row_heading level0 row5\" >5</th>\n",
       "      <td id=\"T_653b4_row5_col0\" class=\"data row5 col0\" >GatedAdditiveTreeEnsembleModel</td>\n",
       "      <td id=\"T_653b4_row5_col1\" class=\"data row5 col1\" >79 T</td>\n",
       "      <td id=\"T_653b4_row5_col2\" class=\"data row5 col2\" >0.673274</td>\n",
       "      <td id=\"T_653b4_row5_col3\" class=\"data row5 col3\" >0.660746</td>\n",
       "      <td id=\"T_653b4_row5_col4\" class=\"data row5 col4\" >3.624957</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row6\" class=\"row_heading level0 row6\" >2</th>\n",
       "      <td id=\"T_653b4_row6_col0\" class=\"data row6 col0\" >DANetModel</td>\n",
       "      <td id=\"T_653b4_row6_col1\" class=\"data row6 col1\" >431 T</td>\n",
       "      <td id=\"T_653b4_row6_col2\" class=\"data row6 col2\" >0.692986</td>\n",
       "      <td id=\"T_653b4_row6_col3\" class=\"data row6 col3\" >0.644760</td>\n",
       "      <td id=\"T_653b4_row6_col4\" class=\"data row6 col4\" >2.104359</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row7\" class=\"row_heading level0 row7\" >6</th>\n",
       "      <td id=\"T_653b4_row7_col0\" class=\"data row7 col0\" >NODEModel</td>\n",
       "      <td id=\"T_653b4_row7_col1\" class=\"data row7 col1\" >864 T</td>\n",
       "      <td id=\"T_653b4_row7_col2\" class=\"data row7 col2\" >0.676671</td>\n",
       "      <td id=\"T_653b4_row7_col3\" class=\"data row7 col3\" >0.626998</td>\n",
       "      <td id=\"T_653b4_row7_col4\" class=\"data row7 col4\" >1.497243</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_653b4_level0_row8\" class=\"row_heading level0 row8\" >7</th>\n",
       "      <td id=\"T_653b4_row8_col0\" class=\"data row8 col0\" >TabNetModel</td>\n",
       "      <td id=\"T_653b4_row8_col1\" class=\"data row8 col1\" >6 T</td>\n",
       "      <td id=\"T_653b4_row8_col2\" class=\"data row8 col2\" >0.708919</td>\n",
       "      <td id=\"T_653b4_row8_col3\" class=\"data row8 col3\" >0.538188</td>\n",
       "      <td id=\"T_653b4_row8_col4\" class=\"data row8 col4\" >0.484836</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x792c240fe9b0>"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sweep_df.drop(columns=[\"params\", \"time_taken\", \"epochs\"]).sort_values(\"test_accuracy\", ascending=False).style.background_gradient(\n",
    "    subset=[\"test_accuracy\"], cmap=\"RdYlGn\"\n",
    ").background_gradient(subset=[\"time_taken_per_epoch\", \"test_loss\"], cmap=\"RdYlGn_r\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ac592e51-e44f-4964-aa1c-d1403e8297ed",
   "metadata": {},
   "source": [
    "## Model Tuner\n",
    "https://pytorch-tabular.readthedocs.io/en/latest/apidocs_coreclasses/#pytorch_tabular.TabularModelTuner\n",
    "\n",
    "Perfect!! Now that we know the best models, let take the top two and play with their hyperparameters to try find better results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "id": "6f938181-199c-47c6-a9db-167185835f96",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pytorch_tabular.models import (\n",
    "    CategoryEmbeddingModelConfig,\n",
    "    FTTransformerConfig\n",
    ")   "
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2cdc3963-fd5e-4b09-9b72-264d724c865a",
   "metadata": {},
   "source": [
    "We can use two main strategies: \n",
    "- grid_search: to search for all hyperparameters that were defined, but remember that each new fields that you add will considerably increase the total training time. If you configure 4 optimizers, 4 layes, 2 activations and 2 dropout, that means 64 (4 * 4 * 2 * 3) trainings.\n",
    "- random_search: will get randomly get \"n_trials\" hyperparameters settings from each model that has been defined. It is useful for faster training, but remember that will not test all hyperparameters.\n",
    "\n",
    "\n",
    "For all hyperparameters options: https://pytorch-tabular.readthedocs.io/en/latest/apidocs_model/\n",
    "\n",
    "More informations about how the hyperparameter spaces work: https://pytorch-tabular.readthedocs.io/en/latest/tutorials/10-Hyperparameter%20Tuning/#define-the-hyperparameter-space"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "13495cba-9d5d-4bd4-8301-6c5b1f043d4f",
   "metadata": {},
   "source": [
    "Let's define some hyperparameters.\n",
    "\n",
    "PS: This Notebook is to exemplify the functions and does not mean that are the best hyperparameters to try."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "e5eb69d1-68e2-458a-9df0-b440d148e3ee",
   "metadata": {},
   "outputs": [],
   "source": [
    "search_space_category_embedding = {\n",
    "    \"optimizer_config__optimizer\": [\"Adam\", \"SGD\"],\n",
    "    \"model_config__layers\": [\"128-64-32\", \"1024-512-256\", \"32-64-128\", \"256-512-1024\"],\n",
    "    \"model_config__activation\": [\"ReLU\", \"LeakyReLU\"],\n",
    "    \"model_config__embedding_dropout\": [0.0, 0.2],\n",
    "}\n",
    "model_config_category_embedding = CategoryEmbeddingModelConfig(task=\"classification\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "e061a1c9-21ba-49ce-aa9f-f1caadf5dc0c",
   "metadata": {},
   "outputs": [],
   "source": [
    "search_space_ft_transformer = {\n",
    "    \"optimizer_config__optimizer\": [\"Adam\", \"SGD\"],\n",
    "    \"model_config__input_embed_dim\": [32, 64],\n",
    "    \"model_config__num_attn_blocks\": [3, 6, 8],\n",
    "    \"model_config__ff_hidden_multiplier\": [4, 8],\n",
    "    \"model_config__transformer_activation\": [\"GEGLU\", \"LeakyReLU\"],\n",
    "    \"model_config__embedding_dropout\": [0.0, 0.2],\n",
    "}\n",
    "model_config_ft_transformer = FTTransformerConfig(task=\"classification\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c7351fb5-4044-4eeb-b6ce-0c8b70d26fda",
   "metadata": {},
   "source": [
    "Let's add all search spaces and model configs in list.\n",
    "\n",
    "**Important** They must be in the same order and same length"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "a0420e5c-dfec-4765-bdab-9dc2b6a58731",
   "metadata": {},
   "outputs": [],
   "source": [
    "search_spaces = [search_space_category_embedding, search_space_ft_transformer]\n",
    "model_configs = [model_config_category_embedding, model_config_ft_transformer]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "6a8d0a07-3d90-438d-90b8-f0e168733f47",
   "metadata": {},
   "outputs": [],
   "source": [
    "from pytorch_tabular.tabular_model_tuner import TabularModelTuner"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "54bae6b5-a3d7-468a-b51d-6da713cf65ac",
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "application/vnd.jupyter.widget-view+json": {
       "model_id": "08d675a526e143bfb264bb6b1a3a441b",
       "version_major": 2,
       "version_minor": 0
      },
      "text/plain": [
       "Output()"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"></pre>\n"
      ],
      "text/plain": []
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">\n",
       "</pre>\n"
      ],
      "text/plain": [
       "\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "tuner = TabularModelTuner(\n",
    "    data_config=data_config,\n",
    "    model_config=model_configs,\n",
    "    optimizer_config=optimizer_config,\n",
    "    trainer_config=trainer_config\n",
    ")\n",
    "with warnings.catch_warnings():\n",
    "    warnings.simplefilter(\"ignore\")\n",
    "    tuner_df = tuner.tune(\n",
    "        train=train,\n",
    "        validation=valid,\n",
    "        search_space=search_spaces,\n",
    "        strategy=\"grid_search\",  # random_search\n",
    "        # n_trials=5,\n",
    "        metric=\"accuracy\",\n",
    "        mode=\"max\",\n",
    "        progress_bar=True,\n",
    "        verbose=False # Make True if you want to log metrics and params each trial\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "52946631-95d0-45f3-a2d8-cd5b14ddada4",
   "metadata": {},
   "source": [
    "Nice!!! We now know the best architecture and possible hyperparameters for our dataset. Maybe the result is not good enough, but at least will reduce the options. With these results, we will know better which are the best hyperparameters that can be better explored and others that do not make sense to continue using.\n",
    "\n",
    "It is even a good idea to explore the architecture paper so that, who knows, it can guide you further towards the best hyperparameters."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "2d60562a-93fd-4bfb-89e3-98ef09851f8d",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<style type=\"text/css\">\n",
       "#T_8e976_row0_col6, #T_8e976_row0_col7 {\n",
       "  background-color: #006837;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row1_col6 {\n",
       "  background-color: #15904c;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row1_col7 {\n",
       "  background-color: #18954f;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row2_col6 {\n",
       "  background-color: #118848;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row2_col7 {\n",
       "  background-color: #199750;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row3_col6 {\n",
       "  background-color: #42ac5a;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row3_col7 {\n",
       "  background-color: #1b9950;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row4_col6, #T_8e976_row6_col6, #T_8e976_row9_col6, #T_8e976_row27_col7, #T_8e976_row28_col7 {\n",
       "  background-color: #7fc866;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row4_col7, #T_8e976_row5_col7 {\n",
       "  background-color: #1e9a51;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row5_col6, #T_8e976_row29_col7 {\n",
       "  background-color: #82c966;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row6_col7 {\n",
       "  background-color: #249d53;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row7_col6 {\n",
       "  background-color: #33a456;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row7_col7 {\n",
       "  background-color: #39a758;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row8_col6 {\n",
       "  background-color: #8ecf67;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row8_col7 {\n",
       "  background-color: #57b65f;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row9_col7 {\n",
       "  background-color: #5ab760;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row10_col6, #T_8e976_row50_col7 {\n",
       "  background-color: #b7e075;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row10_col7, #T_8e976_row11_col7, #T_8e976_row12_col7 {\n",
       "  background-color: #60ba62;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row11_col6 {\n",
       "  background-color: #93d168;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row12_col6, #T_8e976_row31_col6, #T_8e976_row38_col7, #T_8e976_row39_col7, #T_8e976_row40_col7, #T_8e976_row41_col7, #T_8e976_row42_col7, #T_8e976_row43_col7, #T_8e976_row50_col6 {\n",
       "  background-color: #a5d86a;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row13_col6, #T_8e976_row45_col7 {\n",
       "  background-color: #abdb6d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row13_col7 {\n",
       "  background-color: #63bc62;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row14_col6, #T_8e976_row23_col6 {\n",
       "  background-color: #cdea83;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row14_col7 {\n",
       "  background-color: #66bd63;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row15_col6, #T_8e976_row30_col6, #T_8e976_row32_col6, #T_8e976_row33_col6, #T_8e976_row64_col7 {\n",
       "  background-color: #cbe982;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row15_col7, #T_8e976_row16_col7 {\n",
       "  background-color: #6bbf64;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row16_col6, #T_8e976_row20_col6, #T_8e976_row25_col6, #T_8e976_row49_col7 {\n",
       "  background-color: #b3df72;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row17_col6, #T_8e976_row36_col6 {\n",
       "  background-color: #c1e57b;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row17_col7, #T_8e976_row18_col7 {\n",
       "  background-color: #6ec064;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row18_col6, #T_8e976_row19_col6, #T_8e976_row41_col6, #T_8e976_row55_col6, #T_8e976_row70_col7 {\n",
       "  background-color: #dcf08f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row19_col7 {\n",
       "  background-color: #70c164;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row20_col7, #T_8e976_row21_col7 {\n",
       "  background-color: #73c264;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row21_col6, #T_8e976_row57_col6, #T_8e976_row59_col6 {\n",
       "  background-color: #fff3ac;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row22_col6 {\n",
       "  background-color: #b5df74;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row22_col7, #T_8e976_row23_col7, #T_8e976_row24_col7 {\n",
       "  background-color: #75c465;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row24_col6 {\n",
       "  background-color: #d5ed88;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row25_col7 {\n",
       "  background-color: #78c565;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row26_col6 {\n",
       "  background-color: #afdd70;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row26_col7 {\n",
       "  background-color: #7ac665;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row27_col6, #T_8e976_row28_col6 {\n",
       "  background-color: #e6f59d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row29_col6, #T_8e976_row37_col7 {\n",
       "  background-color: #98d368;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row30_col7, #T_8e976_row31_col7 {\n",
       "  background-color: #84ca66;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row32_col7 {\n",
       "  background-color: #87cb67;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row33_col7, #T_8e976_row34_col7 {\n",
       "  background-color: #8ccd67;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row34_col6, #T_8e976_row51_col7, #T_8e976_row52_col7, #T_8e976_row63_col6 {\n",
       "  background-color: #b9e176;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row35_col6, #T_8e976_row92_col7, #T_8e976_row93_col7 {\n",
       "  background-color: #fffebe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row35_col7, #T_8e976_row36_col7 {\n",
       "  background-color: #96d268;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row37_col6, #T_8e976_row72_col7 {\n",
       "  background-color: #e3f399;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row38_col6, #T_8e976_row65_col7 {\n",
       "  background-color: #cfeb85;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row39_col6, #T_8e976_row66_col7 {\n",
       "  background-color: #d3ec87;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row40_col6, #T_8e976_row44_col7 {\n",
       "  background-color: #a9da6c;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row42_col6, #T_8e976_row47_col6, #T_8e976_row76_col7, #T_8e976_row77_col7, #T_8e976_row78_col7, #T_8e976_row79_col7 {\n",
       "  background-color: #ecf7a6;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row43_col6 {\n",
       "  background-color: #f2faae;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row44_col6, #T_8e976_row74_col7 {\n",
       "  background-color: #e8f59f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row45_col6, #T_8e976_row64_col6, #T_8e976_row73_col6 {\n",
       "  background-color: #fff1a8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row46_col6, #T_8e976_row54_col6 {\n",
       "  background-color: #d1ec86;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row46_col7 {\n",
       "  background-color: #addc6f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row47_col7, #T_8e976_row48_col7 {\n",
       "  background-color: #b1de71;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row48_col6, #T_8e976_row75_col7 {\n",
       "  background-color: #ebf7a3;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row49_col6, #T_8e976_row52_col6 {\n",
       "  background-color: #dff293;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row51_col6, #T_8e976_row61_col6, #T_8e976_row85_col7, #T_8e976_row86_col7, #T_8e976_row87_col7 {\n",
       "  background-color: #f7fcb4;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row53_col6, #T_8e976_row111_col7 {\n",
       "  background-color: #fee999;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row53_col7 {\n",
       "  background-color: #bbe278;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row54_col7, #T_8e976_row55_col7 {\n",
       "  background-color: #bde379;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row56_col6, #T_8e976_row58_col6, #T_8e976_row100_col7 {\n",
       "  background-color: #fff7b2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row56_col7, #T_8e976_row57_col7, #T_8e976_row58_col7 {\n",
       "  background-color: #c3e67d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row59_col7 {\n",
       "  background-color: #c5e67e;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row60_col6 {\n",
       "  background-color: #fdfebc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row60_col7, #T_8e976_row61_col7, #T_8e976_row62_col7 {\n",
       "  background-color: #c7e77f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row62_col6, #T_8e976_row89_col7 {\n",
       "  background-color: #fbfdba;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row63_col7 {\n",
       "  background-color: #c9e881;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row65_col6 {\n",
       "  background-color: #fff6b0;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row66_col6, #T_8e976_row88_col7 {\n",
       "  background-color: #fafdb8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row67_col6, #T_8e976_row94_col7 {\n",
       "  background-color: #fffdbc;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row67_col7, #T_8e976_row68_col7, #T_8e976_row69_col7 {\n",
       "  background-color: #daf08d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row68_col6, #T_8e976_row106_col7 {\n",
       "  background-color: #fff0a6;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row69_col6, #T_8e976_row92_col6 {\n",
       "  background-color: #fee593;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row70_col6, #T_8e976_row74_col6, #T_8e976_row108_col7 {\n",
       "  background-color: #feeda1;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row71_col6, #T_8e976_row106_col6 {\n",
       "  background-color: #fa9857;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row71_col7 {\n",
       "  background-color: #e0f295;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row72_col6 {\n",
       "  background-color: #fed481;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row73_col7 {\n",
       "  background-color: #e5f49b;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row75_col6 {\n",
       "  background-color: #fee695;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row76_col6 {\n",
       "  background-color: #fee491;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row77_col6 {\n",
       "  background-color: #fdc171;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row78_col6 {\n",
       "  background-color: #fdb567;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row79_col6 {\n",
       "  background-color: #fed27f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row80_col6, #T_8e976_row98_col6, #T_8e976_row108_col6 {\n",
       "  background-color: #fdbd6d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row80_col7, #T_8e976_row81_col7 {\n",
       "  background-color: #eef8a8;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row81_col6, #T_8e976_row91_col6 {\n",
       "  background-color: #fec877;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row82_col6, #T_8e976_row83_col6 {\n",
       "  background-color: #fdbb6c;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row82_col7, #T_8e976_row83_col7, #T_8e976_row84_col7 {\n",
       "  background-color: #f5fbb2;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row84_col6 {\n",
       "  background-color: #fed683;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row85_col6 {\n",
       "  background-color: #f88c51;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row86_col6 {\n",
       "  background-color: #fdc372;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row87_col6, #T_8e976_row88_col6, #T_8e976_row100_col6 {\n",
       "  background-color: #fdb163;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row89_col6 {\n",
       "  background-color: #fdb96a;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row90_col6, #T_8e976_row104_col6, #T_8e976_row111_col6 {\n",
       "  background-color: #fcaa5f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row90_col7, #T_8e976_row91_col7 {\n",
       "  background-color: #feffbe;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row93_col6, #T_8e976_row103_col6, #T_8e976_row112_col6 {\n",
       "  background-color: #f99355;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row94_col6, #T_8e976_row110_col6 {\n",
       "  background-color: #fdb768;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row95_col6, #T_8e976_row109_col6 {\n",
       "  background-color: #fdc574;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row95_col7, #T_8e976_row96_col7, #T_8e976_row97_col7 {\n",
       "  background-color: #fffcba;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row96_col6 {\n",
       "  background-color: #fece7c;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row97_col6 {\n",
       "  background-color: #fba35c;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row98_col7 {\n",
       "  background-color: #fffab6;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row99_col6, #T_8e976_row115_col6 {\n",
       "  background-color: #fb9d59;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row99_col7 {\n",
       "  background-color: #fff8b4;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row101_col6 {\n",
       "  background-color: #f99153;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row101_col7 {\n",
       "  background-color: #fff5ae;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row102_col6 {\n",
       "  background-color: #fa9656;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row102_col7, #T_8e976_row103_col7, #T_8e976_row104_col7, #T_8e976_row105_col7 {\n",
       "  background-color: #fff2aa;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row105_col6 {\n",
       "  background-color: #f88950;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row107_col6 {\n",
       "  background-color: #fdbf6f;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row107_col7 {\n",
       "  background-color: #feefa3;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row109_col7, #T_8e976_row110_col7 {\n",
       "  background-color: #feea9b;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row112_col7 {\n",
       "  background-color: #fee797;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row113_col6, #T_8e976_row121_col7 {\n",
       "  background-color: #fba05b;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row113_col7, #T_8e976_row114_col7 {\n",
       "  background-color: #fee18d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row114_col6 {\n",
       "  background-color: #fca85e;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row115_col7 {\n",
       "  background-color: #feda86;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row116_col6, #T_8e976_row118_col6 {\n",
       "  background-color: #fdad60;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row116_col7 {\n",
       "  background-color: #fed884;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row117_col6, #T_8e976_row124_col7, #T_8e976_row125_col7 {\n",
       "  background-color: #e95538;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row117_col7, #T_8e976_row118_col7 {\n",
       "  background-color: #fed07e;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row119_col6 {\n",
       "  background-color: #e24731;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row119_col7 {\n",
       "  background-color: #fdc776;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row120_col6 {\n",
       "  background-color: #a70226;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row120_col7 {\n",
       "  background-color: #fca55d;\n",
       "  color: #000000;\n",
       "}\n",
       "#T_8e976_row121_col6, #T_8e976_row122_col6 {\n",
       "  background-color: #f47044;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row122_col7, #T_8e976_row123_col7 {\n",
       "  background-color: #f98e52;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row123_col6 {\n",
       "  background-color: #dd3d2d;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row124_col6 {\n",
       "  background-color: #ed5f3c;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row125_col6 {\n",
       "  background-color: #f67f4b;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row126_col6 {\n",
       "  background-color: #f36b42;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row126_col7 {\n",
       "  background-color: #e34933;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "#T_8e976_row127_col6, #T_8e976_row127_col7 {\n",
       "  background-color: #a50026;\n",
       "  color: #f1f1f1;\n",
       "}\n",
       "</style>\n",
       "<table id=\"T_8e976\">\n",
       "  <thead>\n",
       "    <tr>\n",
       "      <th class=\"blank level0\" >&nbsp;</th>\n",
       "      <th id=\"T_8e976_level0_col0\" class=\"col_heading level0 col0\" >trial_id</th>\n",
       "      <th id=\"T_8e976_level0_col1\" class=\"col_heading level0 col1\" >model</th>\n",
       "      <th id=\"T_8e976_level0_col2\" class=\"col_heading level0 col2\" >model_config__activation</th>\n",
       "      <th id=\"T_8e976_level0_col3\" class=\"col_heading level0 col3\" >model_config__embedding_dropout</th>\n",
       "      <th id=\"T_8e976_level0_col4\" class=\"col_heading level0 col4\" >model_config__layers</th>\n",
       "      <th id=\"T_8e976_level0_col5\" class=\"col_heading level0 col5\" >optimizer_config__optimizer</th>\n",
       "      <th id=\"T_8e976_level0_col6\" class=\"col_heading level0 col6\" >loss</th>\n",
       "      <th id=\"T_8e976_level0_col7\" class=\"col_heading level0 col7\" >accuracy</th>\n",
       "      <th id=\"T_8e976_level0_col8\" class=\"col_heading level0 col8\" >model_config__ff_hidden_multiplier</th>\n",
       "      <th id=\"T_8e976_level0_col9\" class=\"col_heading level0 col9\" >model_config__input_embed_dim</th>\n",
       "      <th id=\"T_8e976_level0_col10\" class=\"col_heading level0 col10\" >model_config__num_attn_blocks</th>\n",
       "      <th id=\"T_8e976_level0_col11\" class=\"col_heading level0 col11\" >model_config__transformer_activation</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row0\" class=\"row_heading level0 row0\" >22</th>\n",
       "      <td id=\"T_8e976_row0_col0\" class=\"data row0 col0\" >22</td>\n",
       "      <td id=\"T_8e976_row0_col1\" class=\"data row0 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row0_col2\" class=\"data row0 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row0_col3\" class=\"data row0 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row0_col4\" class=\"data row0 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row0_col5\" class=\"data row0 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row0_col6\" class=\"data row0 col6\" >0.339012</td>\n",
       "      <td id=\"T_8e976_row0_col7\" class=\"data row0 col7\" >0.857904</td>\n",
       "      <td id=\"T_8e976_row0_col8\" class=\"data row0 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row0_col9\" class=\"data row0 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row0_col10\" class=\"data row0 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row0_col11\" class=\"data row0 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row1\" class=\"row_heading level0 row1\" >26</th>\n",
       "      <td id=\"T_8e976_row1_col0\" class=\"data row1 col0\" >26</td>\n",
       "      <td id=\"T_8e976_row1_col1\" class=\"data row1 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row1_col2\" class=\"data row1 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row1_col3\" class=\"data row1 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row1_col4\" class=\"data row1 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row1_col5\" class=\"data row1 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row1_col6\" class=\"data row1 col6\" >0.375515</td>\n",
       "      <td id=\"T_8e976_row1_col7\" class=\"data row1 col7\" >0.817052</td>\n",
       "      <td id=\"T_8e976_row1_col8\" class=\"data row1 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row1_col9\" class=\"data row1 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row1_col10\" class=\"data row1 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row1_col11\" class=\"data row1 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row2\" class=\"row_heading level0 row2\" >20</th>\n",
       "      <td id=\"T_8e976_row2_col0\" class=\"data row2 col0\" >20</td>\n",
       "      <td id=\"T_8e976_row2_col1\" class=\"data row2 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row2_col2\" class=\"data row2 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row2_col3\" class=\"data row2 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row2_col4\" class=\"data row2 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row2_col5\" class=\"data row2 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row2_col6\" class=\"data row2 col6\" >0.368664</td>\n",
       "      <td id=\"T_8e976_row2_col7\" class=\"data row2 col7\" >0.815275</td>\n",
       "      <td id=\"T_8e976_row2_col8\" class=\"data row2 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row2_col9\" class=\"data row2 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row2_col10\" class=\"data row2 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row2_col11\" class=\"data row2 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row3\" class=\"row_heading level0 row3\" >2</th>\n",
       "      <td id=\"T_8e976_row3_col0\" class=\"data row3 col0\" >2</td>\n",
       "      <td id=\"T_8e976_row3_col1\" class=\"data row3 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row3_col2\" class=\"data row3 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row3_col3\" class=\"data row3 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row3_col4\" class=\"data row3 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row3_col5\" class=\"data row3 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row3_col6\" class=\"data row3 col6\" >0.407023</td>\n",
       "      <td id=\"T_8e976_row3_col7\" class=\"data row3 col7\" >0.813499</td>\n",
       "      <td id=\"T_8e976_row3_col8\" class=\"data row3 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row3_col9\" class=\"data row3 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row3_col10\" class=\"data row3 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row3_col11\" class=\"data row3 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row4\" class=\"row_heading level0 row4\" >6</th>\n",
       "      <td id=\"T_8e976_row4_col0\" class=\"data row4 col0\" >6</td>\n",
       "      <td id=\"T_8e976_row4_col1\" class=\"data row4 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row4_col2\" class=\"data row4 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row4_col3\" class=\"data row4 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row4_col4\" class=\"data row4 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row4_col5\" class=\"data row4 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row4_col6\" class=\"data row4 col6\" >0.445294</td>\n",
       "      <td id=\"T_8e976_row4_col7\" class=\"data row4 col7\" >0.811723</td>\n",
       "      <td id=\"T_8e976_row4_col8\" class=\"data row4 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row4_col9\" class=\"data row4 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row4_col10\" class=\"data row4 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row4_col11\" class=\"data row4 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row5\" class=\"row_heading level0 row5\" >10</th>\n",
       "      <td id=\"T_8e976_row5_col0\" class=\"data row5 col0\" >10</td>\n",
       "      <td id=\"T_8e976_row5_col1\" class=\"data row5 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row5_col2\" class=\"data row5 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row5_col3\" class=\"data row5 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row5_col4\" class=\"data row5 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row5_col5\" class=\"data row5 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row5_col6\" class=\"data row5 col6\" >0.446737</td>\n",
       "      <td id=\"T_8e976_row5_col7\" class=\"data row5 col7\" >0.811723</td>\n",
       "      <td id=\"T_8e976_row5_col8\" class=\"data row5 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row5_col9\" class=\"data row5 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row5_col10\" class=\"data row5 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row5_col11\" class=\"data row5 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row6\" class=\"row_heading level0 row6\" >18</th>\n",
       "      <td id=\"T_8e976_row6_col0\" class=\"data row6 col0\" >18</td>\n",
       "      <td id=\"T_8e976_row6_col1\" class=\"data row6 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row6_col2\" class=\"data row6 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row6_col3\" class=\"data row6 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row6_col4\" class=\"data row6 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row6_col5\" class=\"data row6 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row6_col6\" class=\"data row6 col6\" >0.444420</td>\n",
       "      <td id=\"T_8e976_row6_col7\" class=\"data row6 col7\" >0.808170</td>\n",
       "      <td id=\"T_8e976_row6_col8\" class=\"data row6 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row6_col9\" class=\"data row6 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row6_col10\" class=\"data row6 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row6_col11\" class=\"data row6 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row7\" class=\"row_heading level0 row7\" >30</th>\n",
       "      <td id=\"T_8e976_row7_col0\" class=\"data row7 col0\" >30</td>\n",
       "      <td id=\"T_8e976_row7_col1\" class=\"data row7 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row7_col2\" class=\"data row7 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row7_col3\" class=\"data row7 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row7_col4\" class=\"data row7 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row7_col5\" class=\"data row7 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row7_col6\" class=\"data row7 col6\" >0.398530</td>\n",
       "      <td id=\"T_8e976_row7_col7\" class=\"data row7 col7\" >0.797513</td>\n",
       "      <td id=\"T_8e976_row7_col8\" class=\"data row7 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row7_col9\" class=\"data row7 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row7_col10\" class=\"data row7 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row7_col11\" class=\"data row7 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row8\" class=\"row_heading level0 row8\" >14</th>\n",
       "      <td id=\"T_8e976_row8_col0\" class=\"data row8 col0\" >14</td>\n",
       "      <td id=\"T_8e976_row8_col1\" class=\"data row8 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row8_col2\" class=\"data row8 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row8_col3\" class=\"data row8 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row8_col4\" class=\"data row8 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row8_col5\" class=\"data row8 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row8_col6\" class=\"data row8 col6\" >0.455243</td>\n",
       "      <td id=\"T_8e976_row8_col7\" class=\"data row8 col7\" >0.781528</td>\n",
       "      <td id=\"T_8e976_row8_col8\" class=\"data row8 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row8_col9\" class=\"data row8 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row8_col10\" class=\"data row8 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row8_col11\" class=\"data row8 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row9\" class=\"row_heading level0 row9\" >72</th>\n",
       "      <td id=\"T_8e976_row9_col0\" class=\"data row9 col0\" >40</td>\n",
       "      <td id=\"T_8e976_row9_col1\" class=\"data row9 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row9_col2\" class=\"data row9 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row9_col3\" class=\"data row9 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row9_col4\" class=\"data row9 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row9_col5\" class=\"data row9 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row9_col6\" class=\"data row9 col6\" >0.445089</td>\n",
       "      <td id=\"T_8e976_row9_col7\" class=\"data row9 col7\" >0.779751</td>\n",
       "      <td id=\"T_8e976_row9_col8\" class=\"data row9 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row9_col9\" class=\"data row9 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row9_col10\" class=\"data row9 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row9_col11\" class=\"data row9 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row10\" class=\"row_heading level0 row10\" >8</th>\n",
       "      <td id=\"T_8e976_row10_col0\" class=\"data row10 col0\" >8</td>\n",
       "      <td id=\"T_8e976_row10_col1\" class=\"data row10 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row10_col2\" class=\"data row10 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row10_col3\" class=\"data row10 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row10_col4\" class=\"data row10 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row10_col5\" class=\"data row10 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row10_col6\" class=\"data row10 col6\" >0.486341</td>\n",
       "      <td id=\"T_8e976_row10_col7\" class=\"data row10 col7\" >0.776199</td>\n",
       "      <td id=\"T_8e976_row10_col8\" class=\"data row10 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row10_col9\" class=\"data row10 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row10_col10\" class=\"data row10 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row10_col11\" class=\"data row10 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row11\" class=\"row_heading level0 row11\" >16</th>\n",
       "      <td id=\"T_8e976_row11_col0\" class=\"data row11 col0\" >16</td>\n",
       "      <td id=\"T_8e976_row11_col1\" class=\"data row11 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row11_col2\" class=\"data row11 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row11_col3\" class=\"data row11 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row11_col4\" class=\"data row11 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row11_col5\" class=\"data row11 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row11_col6\" class=\"data row11 col6\" >0.458817</td>\n",
       "      <td id=\"T_8e976_row11_col7\" class=\"data row11 col7\" >0.776199</td>\n",
       "      <td id=\"T_8e976_row11_col8\" class=\"data row11 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row11_col9\" class=\"data row11 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row11_col10\" class=\"data row11 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row11_col11\" class=\"data row11 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row12\" class=\"row_heading level0 row12\" >116</th>\n",
       "      <td id=\"T_8e976_row12_col0\" class=\"data row12 col0\" >84</td>\n",
       "      <td id=\"T_8e976_row12_col1\" class=\"data row12 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row12_col2\" class=\"data row12 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row12_col3\" class=\"data row12 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row12_col4\" class=\"data row12 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row12_col5\" class=\"data row12 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row12_col6\" class=\"data row12 col6\" >0.471312</td>\n",
       "      <td id=\"T_8e976_row12_col7\" class=\"data row12 col7\" >0.776199</td>\n",
       "      <td id=\"T_8e976_row12_col8\" class=\"data row12 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row12_col9\" class=\"data row12 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row12_col10\" class=\"data row12 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row12_col11\" class=\"data row12 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row13\" class=\"row_heading level0 row13\" >62</th>\n",
       "      <td id=\"T_8e976_row13_col0\" class=\"data row13 col0\" >30</td>\n",
       "      <td id=\"T_8e976_row13_col1\" class=\"data row13 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row13_col2\" class=\"data row13 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row13_col3\" class=\"data row13 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row13_col4\" class=\"data row13 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row13_col5\" class=\"data row13 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row13_col6\" class=\"data row13 col6\" >0.475959</td>\n",
       "      <td id=\"T_8e976_row13_col7\" class=\"data row13 col7\" >0.774423</td>\n",
       "      <td id=\"T_8e976_row13_col8\" class=\"data row13 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row13_col9\" class=\"data row13 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row13_col10\" class=\"data row13 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row13_col11\" class=\"data row13 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row14\" class=\"row_heading level0 row14\" >36</th>\n",
       "      <td id=\"T_8e976_row14_col0\" class=\"data row14 col0\" >4</td>\n",
       "      <td id=\"T_8e976_row14_col1\" class=\"data row14 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row14_col2\" class=\"data row14 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row14_col3\" class=\"data row14 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row14_col4\" class=\"data row14 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row14_col5\" class=\"data row14 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row14_col6\" class=\"data row14 col6\" >0.506062</td>\n",
       "      <td id=\"T_8e976_row14_col7\" class=\"data row14 col7\" >0.772647</td>\n",
       "      <td id=\"T_8e976_row14_col8\" class=\"data row14 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row14_col9\" class=\"data row14 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row14_col10\" class=\"data row14 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row14_col11\" class=\"data row14 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row15\" class=\"row_heading level0 row15\" >28</th>\n",
       "      <td id=\"T_8e976_row15_col0\" class=\"data row15 col0\" >28</td>\n",
       "      <td id=\"T_8e976_row15_col1\" class=\"data row15 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row15_col2\" class=\"data row15 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row15_col3\" class=\"data row15 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row15_col4\" class=\"data row15 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row15_col5\" class=\"data row15 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row15_col6\" class=\"data row15 col6\" >0.503373</td>\n",
       "      <td id=\"T_8e976_row15_col7\" class=\"data row15 col7\" >0.769094</td>\n",
       "      <td id=\"T_8e976_row15_col8\" class=\"data row15 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row15_col9\" class=\"data row15 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row15_col10\" class=\"data row15 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row15_col11\" class=\"data row15 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row16\" class=\"row_heading level0 row16\" >0</th>\n",
       "      <td id=\"T_8e976_row16_col0\" class=\"data row16 col0\" >0</td>\n",
       "      <td id=\"T_8e976_row16_col1\" class=\"data row16 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row16_col2\" class=\"data row16 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row16_col3\" class=\"data row16 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row16_col4\" class=\"data row16 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row16_col5\" class=\"data row16 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row16_col6\" class=\"data row16 col6\" >0.482425</td>\n",
       "      <td id=\"T_8e976_row16_col7\" class=\"data row16 col7\" >0.769094</td>\n",
       "      <td id=\"T_8e976_row16_col8\" class=\"data row16 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row16_col9\" class=\"data row16 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row16_col10\" class=\"data row16 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row16_col11\" class=\"data row16 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row17\" class=\"row_heading level0 row17\" >60</th>\n",
       "      <td id=\"T_8e976_row17_col0\" class=\"data row17 col0\" >28</td>\n",
       "      <td id=\"T_8e976_row17_col1\" class=\"data row17 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row17_col2\" class=\"data row17 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row17_col3\" class=\"data row17 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row17_col4\" class=\"data row17 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row17_col5\" class=\"data row17 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row17_col6\" class=\"data row17 col6\" >0.495479</td>\n",
       "      <td id=\"T_8e976_row17_col7\" class=\"data row17 col7\" >0.767318</td>\n",
       "      <td id=\"T_8e976_row17_col8\" class=\"data row17 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row17_col9\" class=\"data row17 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row17_col10\" class=\"data row17 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row17_col11\" class=\"data row17 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row18\" class=\"row_heading level0 row18\" >56</th>\n",
       "      <td id=\"T_8e976_row18_col0\" class=\"data row18 col0\" >24</td>\n",
       "      <td id=\"T_8e976_row18_col1\" class=\"data row18 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row18_col2\" class=\"data row18 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row18_col3\" class=\"data row18 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row18_col4\" class=\"data row18 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row18_col5\" class=\"data row18 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row18_col6\" class=\"data row18 col6\" >0.519672</td>\n",
       "      <td id=\"T_8e976_row18_col7\" class=\"data row18 col7\" >0.767318</td>\n",
       "      <td id=\"T_8e976_row18_col8\" class=\"data row18 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row18_col9\" class=\"data row18 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row18_col10\" class=\"data row18 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row18_col11\" class=\"data row18 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row19\" class=\"row_heading level0 row19\" >80</th>\n",
       "      <td id=\"T_8e976_row19_col0\" class=\"data row19 col0\" >48</td>\n",
       "      <td id=\"T_8e976_row19_col1\" class=\"data row19 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row19_col2\" class=\"data row19 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row19_col3\" class=\"data row19 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row19_col4\" class=\"data row19 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row19_col5\" class=\"data row19 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row19_col6\" class=\"data row19 col6\" >0.518865</td>\n",
       "      <td id=\"T_8e976_row19_col7\" class=\"data row19 col7\" >0.765542</td>\n",
       "      <td id=\"T_8e976_row19_col8\" class=\"data row19 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row19_col9\" class=\"data row19 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row19_col10\" class=\"data row19 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row19_col11\" class=\"data row19 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row20\" class=\"row_heading level0 row20\" >74</th>\n",
       "      <td id=\"T_8e976_row20_col0\" class=\"data row20 col0\" >42</td>\n",
       "      <td id=\"T_8e976_row20_col1\" class=\"data row20 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row20_col2\" class=\"data row20 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row20_col3\" class=\"data row20 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row20_col4\" class=\"data row20 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row20_col5\" class=\"data row20 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row20_col6\" class=\"data row20 col6\" >0.483879</td>\n",
       "      <td id=\"T_8e976_row20_col7\" class=\"data row20 col7\" >0.763766</td>\n",
       "      <td id=\"T_8e976_row20_col8\" class=\"data row20 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row20_col9\" class=\"data row20 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row20_col10\" class=\"data row20 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row20_col11\" class=\"data row20 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row21\" class=\"row_heading level0 row21\" >64</th>\n",
       "      <td id=\"T_8e976_row21_col0\" class=\"data row21 col0\" >32</td>\n",
       "      <td id=\"T_8e976_row21_col1\" class=\"data row21 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row21_col2\" class=\"data row21 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row21_col3\" class=\"data row21 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row21_col4\" class=\"data row21 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row21_col5\" class=\"data row21 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row21_col6\" class=\"data row21 col6\" >0.575869</td>\n",
       "      <td id=\"T_8e976_row21_col7\" class=\"data row21 col7\" >0.763766</td>\n",
       "      <td id=\"T_8e976_row21_col8\" class=\"data row21 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row21_col9\" class=\"data row21 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row21_col10\" class=\"data row21 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row21_col11\" class=\"data row21 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row22\" class=\"row_heading level0 row22\" >94</th>\n",
       "      <td id=\"T_8e976_row22_col0\" class=\"data row22 col0\" >62</td>\n",
       "      <td id=\"T_8e976_row22_col1\" class=\"data row22 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row22_col2\" class=\"data row22 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row22_col3\" class=\"data row22 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row22_col4\" class=\"data row22 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row22_col5\" class=\"data row22 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row22_col6\" class=\"data row22 col6\" >0.484891</td>\n",
       "      <td id=\"T_8e976_row22_col7\" class=\"data row22 col7\" >0.761989</td>\n",
       "      <td id=\"T_8e976_row22_col8\" class=\"data row22 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row22_col9\" class=\"data row22 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row22_col10\" class=\"data row22 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row22_col11\" class=\"data row22 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row23\" class=\"row_heading level0 row23\" >66</th>\n",
       "      <td id=\"T_8e976_row23_col0\" class=\"data row23 col0\" >34</td>\n",
       "      <td id=\"T_8e976_row23_col1\" class=\"data row23 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row23_col2\" class=\"data row23 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row23_col3\" class=\"data row23 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row23_col4\" class=\"data row23 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row23_col5\" class=\"data row23 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row23_col6\" class=\"data row23 col6\" >0.506116</td>\n",
       "      <td id=\"T_8e976_row23_col7\" class=\"data row23 col7\" >0.761989</td>\n",
       "      <td id=\"T_8e976_row23_col8\" class=\"data row23 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row23_col9\" class=\"data row23 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row23_col10\" class=\"data row23 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row23_col11\" class=\"data row23 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row24\" class=\"row_heading level0 row24\" >52</th>\n",
       "      <td id=\"T_8e976_row24_col0\" class=\"data row24 col0\" >20</td>\n",
       "      <td id=\"T_8e976_row24_col1\" class=\"data row24 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row24_col2\" class=\"data row24 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row24_col3\" class=\"data row24 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row24_col4\" class=\"data row24 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row24_col5\" class=\"data row24 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row24_col6\" class=\"data row24 col6\" >0.511868</td>\n",
       "      <td id=\"T_8e976_row24_col7\" class=\"data row24 col7\" >0.761989</td>\n",
       "      <td id=\"T_8e976_row24_col8\" class=\"data row24 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row24_col9\" class=\"data row24 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row24_col10\" class=\"data row24 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row24_col11\" class=\"data row24 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row25\" class=\"row_heading level0 row25\" >96</th>\n",
       "      <td id=\"T_8e976_row25_col0\" class=\"data row25 col0\" >64</td>\n",
       "      <td id=\"T_8e976_row25_col1\" class=\"data row25 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row25_col2\" class=\"data row25 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row25_col3\" class=\"data row25 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row25_col4\" class=\"data row25 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row25_col5\" class=\"data row25 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row25_col6\" class=\"data row25 col6\" >0.482814</td>\n",
       "      <td id=\"T_8e976_row25_col7\" class=\"data row25 col7\" >0.760213</td>\n",
       "      <td id=\"T_8e976_row25_col8\" class=\"data row25 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row25_col9\" class=\"data row25 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row25_col10\" class=\"data row25 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row25_col11\" class=\"data row25 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row26\" class=\"row_heading level0 row26\" >110</th>\n",
       "      <td id=\"T_8e976_row26_col0\" class=\"data row26 col0\" >78</td>\n",
       "      <td id=\"T_8e976_row26_col1\" class=\"data row26 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row26_col2\" class=\"data row26 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row26_col3\" class=\"data row26 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row26_col4\" class=\"data row26 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row26_col5\" class=\"data row26 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row26_col6\" class=\"data row26 col6\" >0.479574</td>\n",
       "      <td id=\"T_8e976_row26_col7\" class=\"data row26 col7\" >0.758437</td>\n",
       "      <td id=\"T_8e976_row26_col8\" class=\"data row26 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row26_col9\" class=\"data row26 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row26_col10\" class=\"data row26 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row26_col11\" class=\"data row26 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row27\" class=\"row_heading level0 row27\" >19</th>\n",
       "      <td id=\"T_8e976_row27_col0\" class=\"data row27 col0\" >19</td>\n",
       "      <td id=\"T_8e976_row27_col1\" class=\"data row27 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row27_col2\" class=\"data row27 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row27_col3\" class=\"data row27 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row27_col4\" class=\"data row27 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row27_col5\" class=\"data row27 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row27_col6\" class=\"data row27 col6\" >0.532006</td>\n",
       "      <td id=\"T_8e976_row27_col7\" class=\"data row27 col7\" >0.756661</td>\n",
       "      <td id=\"T_8e976_row27_col8\" class=\"data row27 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row27_col9\" class=\"data row27 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row27_col10\" class=\"data row27 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row27_col11\" class=\"data row27 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row28\" class=\"row_heading level0 row28\" >124</th>\n",
       "      <td id=\"T_8e976_row28_col0\" class=\"data row28 col0\" >92</td>\n",
       "      <td id=\"T_8e976_row28_col1\" class=\"data row28 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row28_col2\" class=\"data row28 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row28_col3\" class=\"data row28 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row28_col4\" class=\"data row28 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row28_col5\" class=\"data row28 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row28_col6\" class=\"data row28 col6\" >0.532167</td>\n",
       "      <td id=\"T_8e976_row28_col7\" class=\"data row28 col7\" >0.756661</td>\n",
       "      <td id=\"T_8e976_row28_col8\" class=\"data row28 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row28_col9\" class=\"data row28 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row28_col10\" class=\"data row28 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row28_col11\" class=\"data row28 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row29\" class=\"row_heading level0 row29\" >86</th>\n",
       "      <td id=\"T_8e976_row29_col0\" class=\"data row29 col0\" >54</td>\n",
       "      <td id=\"T_8e976_row29_col1\" class=\"data row29 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row29_col2\" class=\"data row29 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row29_col3\" class=\"data row29 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row29_col4\" class=\"data row29 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row29_col5\" class=\"data row29 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row29_col6\" class=\"data row29 col6\" >0.462083</td>\n",
       "      <td id=\"T_8e976_row29_col7\" class=\"data row29 col7\" >0.754885</td>\n",
       "      <td id=\"T_8e976_row29_col8\" class=\"data row29 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row29_col9\" class=\"data row29 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row29_col10\" class=\"data row29 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row29_col11\" class=\"data row29 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row30\" class=\"row_heading level0 row30\" >50</th>\n",
       "      <td id=\"T_8e976_row30_col0\" class=\"data row30 col0\" >18</td>\n",
       "      <td id=\"T_8e976_row30_col1\" class=\"data row30 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row30_col2\" class=\"data row30 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row30_col3\" class=\"data row30 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row30_col4\" class=\"data row30 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row30_col5\" class=\"data row30 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row30_col6\" class=\"data row30 col6\" >0.503736</td>\n",
       "      <td id=\"T_8e976_row30_col7\" class=\"data row30 col7\" >0.753108</td>\n",
       "      <td id=\"T_8e976_row30_col8\" class=\"data row30 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row30_col9\" class=\"data row30 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row30_col10\" class=\"data row30 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row30_col11\" class=\"data row30 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row31\" class=\"row_heading level0 row31\" >42</th>\n",
       "      <td id=\"T_8e976_row31_col0\" class=\"data row31 col0\" >10</td>\n",
       "      <td id=\"T_8e976_row31_col1\" class=\"data row31 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row31_col2\" class=\"data row31 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row31_col3\" class=\"data row31 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row31_col4\" class=\"data row31 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row31_col5\" class=\"data row31 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row31_col6\" class=\"data row31 col6\" >0.470982</td>\n",
       "      <td id=\"T_8e976_row31_col7\" class=\"data row31 col7\" >0.753108</td>\n",
       "      <td id=\"T_8e976_row31_col8\" class=\"data row31 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row31_col9\" class=\"data row31 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row31_col10\" class=\"data row31 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row31_col11\" class=\"data row31 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row32\" class=\"row_heading level0 row32\" >34</th>\n",
       "      <td id=\"T_8e976_row32_col0\" class=\"data row32 col0\" >2</td>\n",
       "      <td id=\"T_8e976_row32_col1\" class=\"data row32 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row32_col2\" class=\"data row32 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row32_col3\" class=\"data row32 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row32_col4\" class=\"data row32 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row32_col5\" class=\"data row32 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row32_col6\" class=\"data row32 col6\" >0.503541</td>\n",
       "      <td id=\"T_8e976_row32_col7\" class=\"data row32 col7\" >0.751332</td>\n",
       "      <td id=\"T_8e976_row32_col8\" class=\"data row32 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row32_col9\" class=\"data row32 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row32_col10\" class=\"data row32 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row32_col11\" class=\"data row32 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row33\" class=\"row_heading level0 row33\" >106</th>\n",
       "      <td id=\"T_8e976_row33_col0\" class=\"data row33 col0\" >74</td>\n",
       "      <td id=\"T_8e976_row33_col1\" class=\"data row33 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row33_col2\" class=\"data row33 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row33_col3\" class=\"data row33 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row33_col4\" class=\"data row33 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row33_col5\" class=\"data row33 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row33_col6\" class=\"data row33 col6\" >0.504346</td>\n",
       "      <td id=\"T_8e976_row33_col7\" class=\"data row33 col7\" >0.747780</td>\n",
       "      <td id=\"T_8e976_row33_col8\" class=\"data row33 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row33_col9\" class=\"data row33 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row33_col10\" class=\"data row33 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row33_col11\" class=\"data row33 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row34\" class=\"row_heading level0 row34\" >46</th>\n",
       "      <td id=\"T_8e976_row34_col0\" class=\"data row34 col0\" >14</td>\n",
       "      <td id=\"T_8e976_row34_col1\" class=\"data row34 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row34_col2\" class=\"data row34 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row34_col3\" class=\"data row34 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row34_col4\" class=\"data row34 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row34_col5\" class=\"data row34 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row34_col6\" class=\"data row34 col6\" >0.488356</td>\n",
       "      <td id=\"T_8e976_row34_col7\" class=\"data row34 col7\" >0.747780</td>\n",
       "      <td id=\"T_8e976_row34_col8\" class=\"data row34 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row34_col9\" class=\"data row34 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row34_col10\" class=\"data row34 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row34_col11\" class=\"data row34 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row35\" class=\"row_heading level0 row35\" >54</th>\n",
       "      <td id=\"T_8e976_row35_col0\" class=\"data row35 col0\" >22</td>\n",
       "      <td id=\"T_8e976_row35_col1\" class=\"data row35 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row35_col2\" class=\"data row35 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row35_col3\" class=\"data row35 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row35_col4\" class=\"data row35 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row35_col5\" class=\"data row35 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row35_col6\" class=\"data row35 col6\" >0.561371</td>\n",
       "      <td id=\"T_8e976_row35_col7\" class=\"data row35 col7\" >0.740675</td>\n",
       "      <td id=\"T_8e976_row35_col8\" class=\"data row35 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row35_col9\" class=\"data row35 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row35_col10\" class=\"data row35 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row35_col11\" class=\"data row35 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row36\" class=\"row_heading level0 row36\" >58</th>\n",
       "      <td id=\"T_8e976_row36_col0\" class=\"data row36 col0\" >26</td>\n",
       "      <td id=\"T_8e976_row36_col1\" class=\"data row36 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row36_col2\" class=\"data row36 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row36_col3\" class=\"data row36 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row36_col4\" class=\"data row36 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row36_col5\" class=\"data row36 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row36_col6\" class=\"data row36 col6\" >0.494664</td>\n",
       "      <td id=\"T_8e976_row36_col7\" class=\"data row36 col7\" >0.740675</td>\n",
       "      <td id=\"T_8e976_row36_col8\" class=\"data row36 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row36_col9\" class=\"data row36 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row36_col10\" class=\"data row36 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row36_col11\" class=\"data row36 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row37\" class=\"row_heading level0 row37\" >88</th>\n",
       "      <td id=\"T_8e976_row37_col0\" class=\"data row37 col0\" >56</td>\n",
       "      <td id=\"T_8e976_row37_col1\" class=\"data row37 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row37_col2\" class=\"data row37 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row37_col3\" class=\"data row37 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row37_col4\" class=\"data row37 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row37_col5\" class=\"data row37 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row37_col6\" class=\"data row37 col6\" >0.527474</td>\n",
       "      <td id=\"T_8e976_row37_col7\" class=\"data row37 col7\" >0.738899</td>\n",
       "      <td id=\"T_8e976_row37_col8\" class=\"data row37 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row37_col9\" class=\"data row37 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row37_col10\" class=\"data row37 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row37_col11\" class=\"data row37 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row38\" class=\"row_heading level0 row38\" >84</th>\n",
       "      <td id=\"T_8e976_row38_col0\" class=\"data row38 col0\" >52</td>\n",
       "      <td id=\"T_8e976_row38_col1\" class=\"data row38 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row38_col2\" class=\"data row38 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row38_col3\" class=\"data row38 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row38_col4\" class=\"data row38 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row38_col5\" class=\"data row38 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row38_col6\" class=\"data row38 col6\" >0.508179</td>\n",
       "      <td id=\"T_8e976_row38_col7\" class=\"data row38 col7\" >0.731794</td>\n",
       "      <td id=\"T_8e976_row38_col8\" class=\"data row38 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row38_col9\" class=\"data row38 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row38_col10\" class=\"data row38 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row38_col11\" class=\"data row38 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row39\" class=\"row_heading level0 row39\" >118</th>\n",
       "      <td id=\"T_8e976_row39_col0\" class=\"data row39 col0\" >86</td>\n",
       "      <td id=\"T_8e976_row39_col1\" class=\"data row39 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row39_col2\" class=\"data row39 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row39_col3\" class=\"data row39 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row39_col4\" class=\"data row39 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row39_col5\" class=\"data row39 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row39_col6\" class=\"data row39 col6\" >0.511033</td>\n",
       "      <td id=\"T_8e976_row39_col7\" class=\"data row39 col7\" >0.731794</td>\n",
       "      <td id=\"T_8e976_row39_col8\" class=\"data row39 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row39_col9\" class=\"data row39 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row39_col10\" class=\"data row39 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row39_col11\" class=\"data row39 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row40\" class=\"row_heading level0 row40\" >120</th>\n",
       "      <td id=\"T_8e976_row40_col0\" class=\"data row40 col0\" >88</td>\n",
       "      <td id=\"T_8e976_row40_col1\" class=\"data row40 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row40_col2\" class=\"data row40 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row40_col3\" class=\"data row40 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row40_col4\" class=\"data row40 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row40_col5\" class=\"data row40 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row40_col6\" class=\"data row40 col6\" >0.473721</td>\n",
       "      <td id=\"T_8e976_row40_col7\" class=\"data row40 col7\" >0.731794</td>\n",
       "      <td id=\"T_8e976_row40_col8\" class=\"data row40 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row40_col9\" class=\"data row40 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row40_col10\" class=\"data row40 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row40_col11\" class=\"data row40 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row41\" class=\"row_heading level0 row41\" >98</th>\n",
       "      <td id=\"T_8e976_row41_col0\" class=\"data row41 col0\" >66</td>\n",
       "      <td id=\"T_8e976_row41_col1\" class=\"data row41 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row41_col2\" class=\"data row41 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row41_col3\" class=\"data row41 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row41_col4\" class=\"data row41 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row41_col5\" class=\"data row41 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row41_col6\" class=\"data row41 col6\" >0.518997</td>\n",
       "      <td id=\"T_8e976_row41_col7\" class=\"data row41 col7\" >0.731794</td>\n",
       "      <td id=\"T_8e976_row41_col8\" class=\"data row41 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row41_col9\" class=\"data row41 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row41_col10\" class=\"data row41 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row41_col11\" class=\"data row41 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row42\" class=\"row_heading level0 row42\" >31</th>\n",
       "      <td id=\"T_8e976_row42_col0\" class=\"data row42 col0\" >31</td>\n",
       "      <td id=\"T_8e976_row42_col1\" class=\"data row42 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row42_col2\" class=\"data row42 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row42_col3\" class=\"data row42 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row42_col4\" class=\"data row42 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row42_col5\" class=\"data row42 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row42_col6\" class=\"data row42 col6\" >0.538754</td>\n",
       "      <td id=\"T_8e976_row42_col7\" class=\"data row42 col7\" >0.731794</td>\n",
       "      <td id=\"T_8e976_row42_col8\" class=\"data row42 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row42_col9\" class=\"data row42 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row42_col10\" class=\"data row42 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row42_col11\" class=\"data row42 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row43\" class=\"row_heading level0 row43\" >40</th>\n",
       "      <td id=\"T_8e976_row43_col0\" class=\"data row43 col0\" >8</td>\n",
       "      <td id=\"T_8e976_row43_col1\" class=\"data row43 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row43_col2\" class=\"data row43 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row43_col3\" class=\"data row43 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row43_col4\" class=\"data row43 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row43_col5\" class=\"data row43 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row43_col6\" class=\"data row43 col6\" >0.546107</td>\n",
       "      <td id=\"T_8e976_row43_col7\" class=\"data row43 col7\" >0.731794</td>\n",
       "      <td id=\"T_8e976_row43_col8\" class=\"data row43 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row43_col9\" class=\"data row43 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row43_col10\" class=\"data row43 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row43_col11\" class=\"data row43 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row44\" class=\"row_heading level0 row44\" >4</th>\n",
       "      <td id=\"T_8e976_row44_col0\" class=\"data row44 col0\" >4</td>\n",
       "      <td id=\"T_8e976_row44_col1\" class=\"data row44 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row44_col2\" class=\"data row44 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row44_col3\" class=\"data row44 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row44_col4\" class=\"data row44 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row44_col5\" class=\"data row44 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row44_col6\" class=\"data row44 col6\" >0.533960</td>\n",
       "      <td id=\"T_8e976_row44_col7\" class=\"data row44 col7\" >0.728242</td>\n",
       "      <td id=\"T_8e976_row44_col8\" class=\"data row44 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row44_col9\" class=\"data row44 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row44_col10\" class=\"data row44 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row44_col11\" class=\"data row44 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row45\" class=\"row_heading level0 row45\" >70</th>\n",
       "      <td id=\"T_8e976_row45_col0\" class=\"data row45 col0\" >38</td>\n",
       "      <td id=\"T_8e976_row45_col1\" class=\"data row45 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row45_col2\" class=\"data row45 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row45_col3\" class=\"data row45 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row45_col4\" class=\"data row45 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row45_col5\" class=\"data row45 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row45_col6\" class=\"data row45 col6\" >0.579302</td>\n",
       "      <td id=\"T_8e976_row45_col7\" class=\"data row45 col7\" >0.726465</td>\n",
       "      <td id=\"T_8e976_row45_col8\" class=\"data row45 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row45_col9\" class=\"data row45 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row45_col10\" class=\"data row45 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row45_col11\" class=\"data row45 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row46\" class=\"row_heading level0 row46\" >12</th>\n",
       "      <td id=\"T_8e976_row46_col0\" class=\"data row46 col0\" >12</td>\n",
       "      <td id=\"T_8e976_row46_col1\" class=\"data row46 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row46_col2\" class=\"data row46 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row46_col3\" class=\"data row46 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row46_col4\" class=\"data row46 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row46_col5\" class=\"data row46 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row46_col6\" class=\"data row46 col6\" >0.508314</td>\n",
       "      <td id=\"T_8e976_row46_col7\" class=\"data row46 col7\" >0.724689</td>\n",
       "      <td id=\"T_8e976_row46_col8\" class=\"data row46 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row46_col9\" class=\"data row46 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row46_col10\" class=\"data row46 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row46_col11\" class=\"data row46 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row47\" class=\"row_heading level0 row47\" >38</th>\n",
       "      <td id=\"T_8e976_row47_col0\" class=\"data row47 col0\" >6</td>\n",
       "      <td id=\"T_8e976_row47_col1\" class=\"data row47 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row47_col2\" class=\"data row47 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row47_col3\" class=\"data row47 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row47_col4\" class=\"data row47 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row47_col5\" class=\"data row47 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row47_col6\" class=\"data row47 col6\" >0.538916</td>\n",
       "      <td id=\"T_8e976_row47_col7\" class=\"data row47 col7\" >0.721137</td>\n",
       "      <td id=\"T_8e976_row47_col8\" class=\"data row47 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row47_col9\" class=\"data row47 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row47_col10\" class=\"data row47 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row47_col11\" class=\"data row47 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row48\" class=\"row_heading level0 row48\" >82</th>\n",
       "      <td id=\"T_8e976_row48_col0\" class=\"data row48 col0\" >50</td>\n",
       "      <td id=\"T_8e976_row48_col1\" class=\"data row48 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row48_col2\" class=\"data row48 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row48_col3\" class=\"data row48 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row48_col4\" class=\"data row48 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row48_col5\" class=\"data row48 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row48_col6\" class=\"data row48 col6\" >0.537538</td>\n",
       "      <td id=\"T_8e976_row48_col7\" class=\"data row48 col7\" >0.721137</td>\n",
       "      <td id=\"T_8e976_row48_col8\" class=\"data row48 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row48_col9\" class=\"data row48 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row48_col10\" class=\"data row48 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row48_col11\" class=\"data row48 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row49\" class=\"row_heading level0 row49\" >122</th>\n",
       "      <td id=\"T_8e976_row49_col0\" class=\"data row49 col0\" >90</td>\n",
       "      <td id=\"T_8e976_row49_col1\" class=\"data row49 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row49_col2\" class=\"data row49 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row49_col3\" class=\"data row49 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row49_col4\" class=\"data row49 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row49_col5\" class=\"data row49 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row49_col6\" class=\"data row49 col6\" >0.522755</td>\n",
       "      <td id=\"T_8e976_row49_col7\" class=\"data row49 col7\" >0.719361</td>\n",
       "      <td id=\"T_8e976_row49_col8\" class=\"data row49 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row49_col9\" class=\"data row49 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row49_col10\" class=\"data row49 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row49_col11\" class=\"data row49 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row50\" class=\"row_heading level0 row50\" >48</th>\n",
       "      <td id=\"T_8e976_row50_col0\" class=\"data row50 col0\" >16</td>\n",
       "      <td id=\"T_8e976_row50_col1\" class=\"data row50 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row50_col2\" class=\"data row50 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row50_col3\" class=\"data row50 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row50_col4\" class=\"data row50 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row50_col5\" class=\"data row50 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row50_col6\" class=\"data row50 col6\" >0.471181</td>\n",
       "      <td id=\"T_8e976_row50_col7\" class=\"data row50 col7\" >0.715808</td>\n",
       "      <td id=\"T_8e976_row50_col8\" class=\"data row50 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row50_col9\" class=\"data row50 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row50_col10\" class=\"data row50 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row50_col11\" class=\"data row50 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row51\" class=\"row_heading level0 row51\" >32</th>\n",
       "      <td id=\"T_8e976_row51_col0\" class=\"data row51 col0\" >0</td>\n",
       "      <td id=\"T_8e976_row51_col1\" class=\"data row51 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row51_col2\" class=\"data row51 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row51_col3\" class=\"data row51 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row51_col4\" class=\"data row51 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row51_col5\" class=\"data row51 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row51_col6\" class=\"data row51 col6\" >0.550226</td>\n",
       "      <td id=\"T_8e976_row51_col7\" class=\"data row51 col7\" >0.714032</td>\n",
       "      <td id=\"T_8e976_row51_col8\" class=\"data row51 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row51_col9\" class=\"data row51 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row51_col10\" class=\"data row51 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row51_col11\" class=\"data row51 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row52\" class=\"row_heading level0 row52\" >108</th>\n",
       "      <td id=\"T_8e976_row52_col0\" class=\"data row52 col0\" >76</td>\n",
       "      <td id=\"T_8e976_row52_col1\" class=\"data row52 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row52_col2\" class=\"data row52 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row52_col3\" class=\"data row52 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row52_col4\" class=\"data row52 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row52_col5\" class=\"data row52 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row52_col6\" class=\"data row52 col6\" >0.523274</td>\n",
       "      <td id=\"T_8e976_row52_col7\" class=\"data row52 col7\" >0.714032</td>\n",
       "      <td id=\"T_8e976_row52_col8\" class=\"data row52 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row52_col9\" class=\"data row52 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row52_col10\" class=\"data row52 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row52_col11\" class=\"data row52 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row53\" class=\"row_heading level0 row53\" >63</th>\n",
       "      <td id=\"T_8e976_row53_col0\" class=\"data row53 col0\" >31</td>\n",
       "      <td id=\"T_8e976_row53_col1\" class=\"data row53 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row53_col2\" class=\"data row53 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row53_col3\" class=\"data row53 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row53_col4\" class=\"data row53 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row53_col5\" class=\"data row53 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row53_col6\" class=\"data row53 col6\" >0.591639</td>\n",
       "      <td id=\"T_8e976_row53_col7\" class=\"data row53 col7\" >0.712256</td>\n",
       "      <td id=\"T_8e976_row53_col8\" class=\"data row53 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row53_col9\" class=\"data row53 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row53_col10\" class=\"data row53 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row53_col11\" class=\"data row53 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row54\" class=\"row_heading level0 row54\" >104</th>\n",
       "      <td id=\"T_8e976_row54_col0\" class=\"data row54 col0\" >72</td>\n",
       "      <td id=\"T_8e976_row54_col1\" class=\"data row54 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row54_col2\" class=\"data row54 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row54_col3\" class=\"data row54 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row54_col4\" class=\"data row54 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row54_col5\" class=\"data row54 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row54_col6\" class=\"data row54 col6\" >0.508801</td>\n",
       "      <td id=\"T_8e976_row54_col7\" class=\"data row54 col7\" >0.710480</td>\n",
       "      <td id=\"T_8e976_row54_col8\" class=\"data row54 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row54_col9\" class=\"data row54 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row54_col10\" class=\"data row54 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row54_col11\" class=\"data row54 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row55\" class=\"row_heading level0 row55\" >24</th>\n",
       "      <td id=\"T_8e976_row55_col0\" class=\"data row55 col0\" >24</td>\n",
       "      <td id=\"T_8e976_row55_col1\" class=\"data row55 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row55_col2\" class=\"data row55 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row55_col3\" class=\"data row55 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row55_col4\" class=\"data row55 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row55_col5\" class=\"data row55 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row55_col6\" class=\"data row55 col6\" >0.519161</td>\n",
       "      <td id=\"T_8e976_row55_col7\" class=\"data row55 col7\" >0.710480</td>\n",
       "      <td id=\"T_8e976_row55_col8\" class=\"data row55 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row55_col9\" class=\"data row55 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row55_col10\" class=\"data row55 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row55_col11\" class=\"data row55 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row56\" class=\"row_heading level0 row56\" >68</th>\n",
       "      <td id=\"T_8e976_row56_col0\" class=\"data row56 col0\" >36</td>\n",
       "      <td id=\"T_8e976_row56_col1\" class=\"data row56 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row56_col2\" class=\"data row56 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row56_col3\" class=\"data row56 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row56_col4\" class=\"data row56 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row56_col5\" class=\"data row56 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row56_col6\" class=\"data row56 col6\" >0.572089</td>\n",
       "      <td id=\"T_8e976_row56_col7\" class=\"data row56 col7\" >0.706927</td>\n",
       "      <td id=\"T_8e976_row56_col8\" class=\"data row56 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row56_col9\" class=\"data row56 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row56_col10\" class=\"data row56 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row56_col11\" class=\"data row56 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row57\" class=\"row_heading level0 row57\" >92</th>\n",
       "      <td id=\"T_8e976_row57_col0\" class=\"data row57 col0\" >60</td>\n",
       "      <td id=\"T_8e976_row57_col1\" class=\"data row57 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row57_col2\" class=\"data row57 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row57_col3\" class=\"data row57 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row57_col4\" class=\"data row57 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row57_col5\" class=\"data row57 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row57_col6\" class=\"data row57 col6\" >0.575852</td>\n",
       "      <td id=\"T_8e976_row57_col7\" class=\"data row57 col7\" >0.706927</td>\n",
       "      <td id=\"T_8e976_row57_col8\" class=\"data row57 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row57_col9\" class=\"data row57 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row57_col10\" class=\"data row57 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row57_col11\" class=\"data row57 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row58\" class=\"row_heading level0 row58\" >126</th>\n",
       "      <td id=\"T_8e976_row58_col0\" class=\"data row58 col0\" >94</td>\n",
       "      <td id=\"T_8e976_row58_col1\" class=\"data row58 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row58_col2\" class=\"data row58 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row58_col3\" class=\"data row58 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row58_col4\" class=\"data row58 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row58_col5\" class=\"data row58 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row58_col6\" class=\"data row58 col6\" >0.570989</td>\n",
       "      <td id=\"T_8e976_row58_col7\" class=\"data row58 col7\" >0.706927</td>\n",
       "      <td id=\"T_8e976_row58_col8\" class=\"data row58 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row58_col9\" class=\"data row58 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row58_col10\" class=\"data row58 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row58_col11\" class=\"data row58 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row59\" class=\"row_heading level0 row59\" >44</th>\n",
       "      <td id=\"T_8e976_row59_col0\" class=\"data row59 col0\" >12</td>\n",
       "      <td id=\"T_8e976_row59_col1\" class=\"data row59 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row59_col2\" class=\"data row59 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row59_col3\" class=\"data row59 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row59_col4\" class=\"data row59 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row59_col5\" class=\"data row59 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row59_col6\" class=\"data row59 col6\" >0.577062</td>\n",
       "      <td id=\"T_8e976_row59_col7\" class=\"data row59 col7\" >0.705151</td>\n",
       "      <td id=\"T_8e976_row59_col8\" class=\"data row59 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row59_col9\" class=\"data row59 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row59_col10\" class=\"data row59 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row59_col11\" class=\"data row59 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row60\" class=\"row_heading level0 row60\" >79</th>\n",
       "      <td id=\"T_8e976_row60_col0\" class=\"data row60 col0\" >47</td>\n",
       "      <td id=\"T_8e976_row60_col1\" class=\"data row60 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row60_col2\" class=\"data row60 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row60_col3\" class=\"data row60 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row60_col4\" class=\"data row60 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row60_col5\" class=\"data row60 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row60_col6\" class=\"data row60 col6\" >0.557485</td>\n",
       "      <td id=\"T_8e976_row60_col7\" class=\"data row60 col7\" >0.703375</td>\n",
       "      <td id=\"T_8e976_row60_col8\" class=\"data row60 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row60_col9\" class=\"data row60 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row60_col10\" class=\"data row60 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row60_col11\" class=\"data row60 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row61\" class=\"row_heading level0 row61\" >51</th>\n",
       "      <td id=\"T_8e976_row61_col0\" class=\"data row61 col0\" >19</td>\n",
       "      <td id=\"T_8e976_row61_col1\" class=\"data row61 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row61_col2\" class=\"data row61 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row61_col3\" class=\"data row61 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row61_col4\" class=\"data row61 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row61_col5\" class=\"data row61 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row61_col6\" class=\"data row61 col6\" >0.550771</td>\n",
       "      <td id=\"T_8e976_row61_col7\" class=\"data row61 col7\" >0.703375</td>\n",
       "      <td id=\"T_8e976_row61_col8\" class=\"data row61 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row61_col9\" class=\"data row61 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row61_col10\" class=\"data row61 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row61_col11\" class=\"data row61 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row62\" class=\"row_heading level0 row62\" >11</th>\n",
       "      <td id=\"T_8e976_row62_col0\" class=\"data row62 col0\" >11</td>\n",
       "      <td id=\"T_8e976_row62_col1\" class=\"data row62 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row62_col2\" class=\"data row62 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row62_col3\" class=\"data row62 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row62_col4\" class=\"data row62 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row62_col5\" class=\"data row62 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row62_col6\" class=\"data row62 col6\" >0.555238</td>\n",
       "      <td id=\"T_8e976_row62_col7\" class=\"data row62 col7\" >0.703375</td>\n",
       "      <td id=\"T_8e976_row62_col8\" class=\"data row62 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row62_col9\" class=\"data row62 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row62_col10\" class=\"data row62 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row62_col11\" class=\"data row62 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row63\" class=\"row_heading level0 row63\" >114</th>\n",
       "      <td id=\"T_8e976_row63_col0\" class=\"data row63 col0\" >82</td>\n",
       "      <td id=\"T_8e976_row63_col1\" class=\"data row63 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row63_col2\" class=\"data row63 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row63_col3\" class=\"data row63 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row63_col4\" class=\"data row63 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row63_col5\" class=\"data row63 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row63_col6\" class=\"data row63 col6\" >0.487832</td>\n",
       "      <td id=\"T_8e976_row63_col7\" class=\"data row63 col7\" >0.701599</td>\n",
       "      <td id=\"T_8e976_row63_col8\" class=\"data row63 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row63_col9\" class=\"data row63 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row63_col10\" class=\"data row63 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row63_col11\" class=\"data row63 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row64\" class=\"row_heading level0 row64\" >90</th>\n",
       "      <td id=\"T_8e976_row64_col0\" class=\"data row64 col0\" >58</td>\n",
       "      <td id=\"T_8e976_row64_col1\" class=\"data row64 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row64_col2\" class=\"data row64 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row64_col3\" class=\"data row64 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row64_col4\" class=\"data row64 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row64_col5\" class=\"data row64 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row64_col6\" class=\"data row64 col6\" >0.579668</td>\n",
       "      <td id=\"T_8e976_row64_col7\" class=\"data row64 col7\" >0.699822</td>\n",
       "      <td id=\"T_8e976_row64_col8\" class=\"data row64 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row64_col9\" class=\"data row64 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row64_col10\" class=\"data row64 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row64_col11\" class=\"data row64 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row65\" class=\"row_heading level0 row65\" >3</th>\n",
       "      <td id=\"T_8e976_row65_col0\" class=\"data row65 col0\" >3</td>\n",
       "      <td id=\"T_8e976_row65_col1\" class=\"data row65 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row65_col2\" class=\"data row65 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row65_col3\" class=\"data row65 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row65_col4\" class=\"data row65 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row65_col5\" class=\"data row65 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row65_col6\" class=\"data row65 col6\" >0.572410</td>\n",
       "      <td id=\"T_8e976_row65_col7\" class=\"data row65 col7\" >0.696270</td>\n",
       "      <td id=\"T_8e976_row65_col8\" class=\"data row65 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row65_col9\" class=\"data row65 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row65_col10\" class=\"data row65 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row65_col11\" class=\"data row65 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row66\" class=\"row_heading level0 row66\" >112</th>\n",
       "      <td id=\"T_8e976_row66_col0\" class=\"data row66 col0\" >80</td>\n",
       "      <td id=\"T_8e976_row66_col1\" class=\"data row66 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row66_col2\" class=\"data row66 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row66_col3\" class=\"data row66 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row66_col4\" class=\"data row66 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row66_col5\" class=\"data row66 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row66_col6\" class=\"data row66 col6\" >0.553881</td>\n",
       "      <td id=\"T_8e976_row66_col7\" class=\"data row66 col7\" >0.692718</td>\n",
       "      <td id=\"T_8e976_row66_col8\" class=\"data row66 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row66_col9\" class=\"data row66 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row66_col10\" class=\"data row66 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row66_col11\" class=\"data row66 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row67\" class=\"row_heading level0 row67\" >15</th>\n",
       "      <td id=\"T_8e976_row67_col0\" class=\"data row67 col0\" >15</td>\n",
       "      <td id=\"T_8e976_row67_col1\" class=\"data row67 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row67_col2\" class=\"data row67 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row67_col3\" class=\"data row67 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row67_col4\" class=\"data row67 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row67_col5\" class=\"data row67 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row67_col6\" class=\"data row67 col6\" >0.562511</td>\n",
       "      <td id=\"T_8e976_row67_col7\" class=\"data row67 col7\" >0.685613</td>\n",
       "      <td id=\"T_8e976_row67_col8\" class=\"data row67 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row67_col9\" class=\"data row67 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row67_col10\" class=\"data row67 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row67_col11\" class=\"data row67 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row68\" class=\"row_heading level0 row68\" >35</th>\n",
       "      <td id=\"T_8e976_row68_col0\" class=\"data row68 col0\" >3</td>\n",
       "      <td id=\"T_8e976_row68_col1\" class=\"data row68 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row68_col2\" class=\"data row68 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row68_col3\" class=\"data row68 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row68_col4\" class=\"data row68 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row68_col5\" class=\"data row68 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row68_col6\" class=\"data row68 col6\" >0.581403</td>\n",
       "      <td id=\"T_8e976_row68_col7\" class=\"data row68 col7\" >0.685613</td>\n",
       "      <td id=\"T_8e976_row68_col8\" class=\"data row68 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row68_col9\" class=\"data row68 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row68_col10\" class=\"data row68 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row68_col11\" class=\"data row68 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row69\" class=\"row_heading level0 row69\" >45</th>\n",
       "      <td id=\"T_8e976_row69_col0\" class=\"data row69 col0\" >13</td>\n",
       "      <td id=\"T_8e976_row69_col1\" class=\"data row69 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row69_col2\" class=\"data row69 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row69_col3\" class=\"data row69 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row69_col4\" class=\"data row69 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row69_col5\" class=\"data row69 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row69_col6\" class=\"data row69 col6\" >0.597738</td>\n",
       "      <td id=\"T_8e976_row69_col7\" class=\"data row69 col7\" >0.685613</td>\n",
       "      <td id=\"T_8e976_row69_col8\" class=\"data row69 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row69_col9\" class=\"data row69 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row69_col10\" class=\"data row69 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row69_col11\" class=\"data row69 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row70\" class=\"row_heading level0 row70\" >100</th>\n",
       "      <td id=\"T_8e976_row70_col0\" class=\"data row70 col0\" >68</td>\n",
       "      <td id=\"T_8e976_row70_col1\" class=\"data row70 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row70_col2\" class=\"data row70 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row70_col3\" class=\"data row70 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row70_col4\" class=\"data row70 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row70_col5\" class=\"data row70 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row70_col6\" class=\"data row70 col6\" >0.584579</td>\n",
       "      <td id=\"T_8e976_row70_col7\" class=\"data row70 col7\" >0.683837</td>\n",
       "      <td id=\"T_8e976_row70_col8\" class=\"data row70 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row70_col9\" class=\"data row70 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row70_col10\" class=\"data row70 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row70_col11\" class=\"data row70 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row71\" class=\"row_heading level0 row71\" >83</th>\n",
       "      <td id=\"T_8e976_row71_col0\" class=\"data row71 col0\" >51</td>\n",
       "      <td id=\"T_8e976_row71_col1\" class=\"data row71 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row71_col2\" class=\"data row71 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row71_col3\" class=\"data row71 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row71_col4\" class=\"data row71 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row71_col5\" class=\"data row71 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row71_col6\" class=\"data row71 col6\" >0.662541</td>\n",
       "      <td id=\"T_8e976_row71_col7\" class=\"data row71 col7\" >0.680284</td>\n",
       "      <td id=\"T_8e976_row71_col8\" class=\"data row71 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row71_col9\" class=\"data row71 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row71_col10\" class=\"data row71 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row71_col11\" class=\"data row71 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row72\" class=\"row_heading level0 row72\" >127</th>\n",
       "      <td id=\"T_8e976_row72_col0\" class=\"data row72 col0\" >95</td>\n",
       "      <td id=\"T_8e976_row72_col1\" class=\"data row72 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row72_col2\" class=\"data row72 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row72_col3\" class=\"data row72 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row72_col4\" class=\"data row72 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row72_col5\" class=\"data row72 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row72_col6\" class=\"data row72 col6\" >0.614641</td>\n",
       "      <td id=\"T_8e976_row72_col7\" class=\"data row72 col7\" >0.676732</td>\n",
       "      <td id=\"T_8e976_row72_col8\" class=\"data row72 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row72_col9\" class=\"data row72 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row72_col10\" class=\"data row72 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row72_col11\" class=\"data row72 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row73\" class=\"row_heading level0 row73\" >101</th>\n",
       "      <td id=\"T_8e976_row73_col0\" class=\"data row73 col0\" >69</td>\n",
       "      <td id=\"T_8e976_row73_col1\" class=\"data row73 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row73_col2\" class=\"data row73 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row73_col3\" class=\"data row73 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row73_col4\" class=\"data row73 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row73_col5\" class=\"data row73 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row73_col6\" class=\"data row73 col6\" >0.579955</td>\n",
       "      <td id=\"T_8e976_row73_col7\" class=\"data row73 col7\" >0.674956</td>\n",
       "      <td id=\"T_8e976_row73_col8\" class=\"data row73 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row73_col9\" class=\"data row73 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row73_col10\" class=\"data row73 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row73_col11\" class=\"data row73 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row74\" class=\"row_heading level0 row74\" >102</th>\n",
       "      <td id=\"T_8e976_row74_col0\" class=\"data row74 col0\" >70</td>\n",
       "      <td id=\"T_8e976_row74_col1\" class=\"data row74 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row74_col2\" class=\"data row74 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row74_col3\" class=\"data row74 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row74_col4\" class=\"data row74 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row74_col5\" class=\"data row74 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row74_col6\" class=\"data row74 col6\" >0.585392</td>\n",
       "      <td id=\"T_8e976_row74_col7\" class=\"data row74 col7\" >0.671403</td>\n",
       "      <td id=\"T_8e976_row74_col8\" class=\"data row74 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row74_col9\" class=\"data row74 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row74_col10\" class=\"data row74 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row74_col11\" class=\"data row74 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row75\" class=\"row_heading level0 row75\" >27</th>\n",
       "      <td id=\"T_8e976_row75_col0\" class=\"data row75 col0\" >27</td>\n",
       "      <td id=\"T_8e976_row75_col1\" class=\"data row75 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row75_col2\" class=\"data row75 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row75_col3\" class=\"data row75 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row75_col4\" class=\"data row75 col4\" >1024-512-256</td>\n",
       "      <td id=\"T_8e976_row75_col5\" class=\"data row75 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row75_col6\" class=\"data row75 col6\" >0.594700</td>\n",
       "      <td id=\"T_8e976_row75_col7\" class=\"data row75 col7\" >0.667851</td>\n",
       "      <td id=\"T_8e976_row75_col8\" class=\"data row75 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row75_col9\" class=\"data row75 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row75_col10\" class=\"data row75 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row75_col11\" class=\"data row75 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row76\" class=\"row_heading level0 row76\" >7</th>\n",
       "      <td id=\"T_8e976_row76_col0\" class=\"data row76 col0\" >7</td>\n",
       "      <td id=\"T_8e976_row76_col1\" class=\"data row76 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row76_col2\" class=\"data row76 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row76_col3\" class=\"data row76 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row76_col4\" class=\"data row76 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row76_col5\" class=\"data row76 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row76_col6\" class=\"data row76 col6\" >0.598617</td>\n",
       "      <td id=\"T_8e976_row76_col7\" class=\"data row76 col7\" >0.666075</td>\n",
       "      <td id=\"T_8e976_row76_col8\" class=\"data row76 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row76_col9\" class=\"data row76 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row76_col10\" class=\"data row76 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row76_col11\" class=\"data row76 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row77\" class=\"row_heading level0 row77\" >121</th>\n",
       "      <td id=\"T_8e976_row77_col0\" class=\"data row77 col0\" >89</td>\n",
       "      <td id=\"T_8e976_row77_col1\" class=\"data row77 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row77_col2\" class=\"data row77 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row77_col3\" class=\"data row77 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row77_col4\" class=\"data row77 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row77_col5\" class=\"data row77 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row77_col6\" class=\"data row77 col6\" >0.632152</td>\n",
       "      <td id=\"T_8e976_row77_col7\" class=\"data row77 col7\" >0.666075</td>\n",
       "      <td id=\"T_8e976_row77_col8\" class=\"data row77 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row77_col9\" class=\"data row77 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row77_col10\" class=\"data row77 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row77_col11\" class=\"data row77 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row78\" class=\"row_heading level0 row78\" >76</th>\n",
       "      <td id=\"T_8e976_row78_col0\" class=\"data row78 col0\" >44</td>\n",
       "      <td id=\"T_8e976_row78_col1\" class=\"data row78 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row78_col2\" class=\"data row78 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row78_col3\" class=\"data row78 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row78_col4\" class=\"data row78 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row78_col5\" class=\"data row78 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row78_col6\" class=\"data row78 col6\" >0.641684</td>\n",
       "      <td id=\"T_8e976_row78_col7\" class=\"data row78 col7\" >0.666075</td>\n",
       "      <td id=\"T_8e976_row78_col8\" class=\"data row78 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row78_col9\" class=\"data row78 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row78_col10\" class=\"data row78 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row78_col11\" class=\"data row78 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row79\" class=\"row_heading level0 row79\" >103</th>\n",
       "      <td id=\"T_8e976_row79_col0\" class=\"data row79 col0\" >71</td>\n",
       "      <td id=\"T_8e976_row79_col1\" class=\"data row79 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row79_col2\" class=\"data row79 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row79_col3\" class=\"data row79 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row79_col4\" class=\"data row79 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row79_col5\" class=\"data row79 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row79_col6\" class=\"data row79 col6\" >0.616750</td>\n",
       "      <td id=\"T_8e976_row79_col7\" class=\"data row79 col7\" >0.666075</td>\n",
       "      <td id=\"T_8e976_row79_col8\" class=\"data row79 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row79_col9\" class=\"data row79 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row79_col10\" class=\"data row79 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row79_col11\" class=\"data row79 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row80\" class=\"row_heading level0 row80\" >91</th>\n",
       "      <td id=\"T_8e976_row80_col0\" class=\"data row80 col0\" >59</td>\n",
       "      <td id=\"T_8e976_row80_col1\" class=\"data row80 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row80_col2\" class=\"data row80 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row80_col3\" class=\"data row80 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row80_col4\" class=\"data row80 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row80_col5\" class=\"data row80 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row80_col6\" class=\"data row80 col6\" >0.634522</td>\n",
       "      <td id=\"T_8e976_row80_col7\" class=\"data row80 col7\" >0.664298</td>\n",
       "      <td id=\"T_8e976_row80_col8\" class=\"data row80 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row80_col9\" class=\"data row80 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row80_col10\" class=\"data row80 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row80_col11\" class=\"data row80 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row81\" class=\"row_heading level0 row81\" >59</th>\n",
       "      <td id=\"T_8e976_row81_col0\" class=\"data row81 col0\" >27</td>\n",
       "      <td id=\"T_8e976_row81_col1\" class=\"data row81 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row81_col2\" class=\"data row81 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row81_col3\" class=\"data row81 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row81_col4\" class=\"data row81 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row81_col5\" class=\"data row81 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row81_col6\" class=\"data row81 col6\" >0.624750</td>\n",
       "      <td id=\"T_8e976_row81_col7\" class=\"data row81 col7\" >0.664298</td>\n",
       "      <td id=\"T_8e976_row81_col8\" class=\"data row81 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row81_col9\" class=\"data row81 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row81_col10\" class=\"data row81 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row81_col11\" class=\"data row81 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row82\" class=\"row_heading level0 row82\" >107</th>\n",
       "      <td id=\"T_8e976_row82_col0\" class=\"data row82 col0\" >75</td>\n",
       "      <td id=\"T_8e976_row82_col1\" class=\"data row82 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row82_col2\" class=\"data row82 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row82_col3\" class=\"data row82 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row82_col4\" class=\"data row82 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row82_col5\" class=\"data row82 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row82_col6\" class=\"data row82 col6\" >0.637458</td>\n",
       "      <td id=\"T_8e976_row82_col7\" class=\"data row82 col7\" >0.657194</td>\n",
       "      <td id=\"T_8e976_row82_col8\" class=\"data row82 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row82_col9\" class=\"data row82 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row82_col10\" class=\"data row82 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row82_col11\" class=\"data row82 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row83\" class=\"row_heading level0 row83\" >69</th>\n",
       "      <td id=\"T_8e976_row83_col0\" class=\"data row83 col0\" >37</td>\n",
       "      <td id=\"T_8e976_row83_col1\" class=\"data row83 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row83_col2\" class=\"data row83 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row83_col3\" class=\"data row83 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row83_col4\" class=\"data row83 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row83_col5\" class=\"data row83 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row83_col6\" class=\"data row83 col6\" >0.636728</td>\n",
       "      <td id=\"T_8e976_row83_col7\" class=\"data row83 col7\" >0.657194</td>\n",
       "      <td id=\"T_8e976_row83_col8\" class=\"data row83 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row83_col9\" class=\"data row83 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row83_col10\" class=\"data row83 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row83_col11\" class=\"data row83 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row84\" class=\"row_heading level0 row84\" >55</th>\n",
       "      <td id=\"T_8e976_row84_col0\" class=\"data row84 col0\" >23</td>\n",
       "      <td id=\"T_8e976_row84_col1\" class=\"data row84 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row84_col2\" class=\"data row84 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row84_col3\" class=\"data row84 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row84_col4\" class=\"data row84 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row84_col5\" class=\"data row84 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row84_col6\" class=\"data row84 col6\" >0.613378</td>\n",
       "      <td id=\"T_8e976_row84_col7\" class=\"data row84 col7\" >0.657194</td>\n",
       "      <td id=\"T_8e976_row84_col8\" class=\"data row84 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row84_col9\" class=\"data row84 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row84_col10\" class=\"data row84 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row84_col11\" class=\"data row84 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row85\" class=\"row_heading level0 row85\" >5</th>\n",
       "      <td id=\"T_8e976_row85_col0\" class=\"data row85 col0\" >5</td>\n",
       "      <td id=\"T_8e976_row85_col1\" class=\"data row85 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row85_col2\" class=\"data row85 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row85_col3\" class=\"data row85 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row85_col4\" class=\"data row85 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row85_col5\" class=\"data row85 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row85_col6\" class=\"data row85 col6\" >0.670955</td>\n",
       "      <td id=\"T_8e976_row85_col7\" class=\"data row85 col7\" >0.655417</td>\n",
       "      <td id=\"T_8e976_row85_col8\" class=\"data row85 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row85_col9\" class=\"data row85 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row85_col10\" class=\"data row85 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row85_col11\" class=\"data row85 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row86\" class=\"row_heading level0 row86\" >117</th>\n",
       "      <td id=\"T_8e976_row86_col0\" class=\"data row86 col0\" >85</td>\n",
       "      <td id=\"T_8e976_row86_col1\" class=\"data row86 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row86_col2\" class=\"data row86 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row86_col3\" class=\"data row86 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row86_col4\" class=\"data row86 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row86_col5\" class=\"data row86 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row86_col6\" class=\"data row86 col6\" >0.629454</td>\n",
       "      <td id=\"T_8e976_row86_col7\" class=\"data row86 col7\" >0.655417</td>\n",
       "      <td id=\"T_8e976_row86_col8\" class=\"data row86 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row86_col9\" class=\"data row86 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row86_col10\" class=\"data row86 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row86_col11\" class=\"data row86 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row87\" class=\"row_heading level0 row87\" >97</th>\n",
       "      <td id=\"T_8e976_row87_col0\" class=\"data row87 col0\" >65</td>\n",
       "      <td id=\"T_8e976_row87_col1\" class=\"data row87 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row87_col2\" class=\"data row87 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row87_col3\" class=\"data row87 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row87_col4\" class=\"data row87 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row87_col5\" class=\"data row87 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row87_col6\" class=\"data row87 col6\" >0.645757</td>\n",
       "      <td id=\"T_8e976_row87_col7\" class=\"data row87 col7\" >0.655417</td>\n",
       "      <td id=\"T_8e976_row87_col8\" class=\"data row87 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row87_col9\" class=\"data row87 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row87_col10\" class=\"data row87 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row87_col11\" class=\"data row87 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row88\" class=\"row_heading level0 row88\" >87</th>\n",
       "      <td id=\"T_8e976_row88_col0\" class=\"data row88 col0\" >55</td>\n",
       "      <td id=\"T_8e976_row88_col1\" class=\"data row88 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row88_col2\" class=\"data row88 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row88_col3\" class=\"data row88 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row88_col4\" class=\"data row88 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row88_col5\" class=\"data row88 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row88_col6\" class=\"data row88 col6\" >0.646177</td>\n",
       "      <td id=\"T_8e976_row88_col7\" class=\"data row88 col7\" >0.651865</td>\n",
       "      <td id=\"T_8e976_row88_col8\" class=\"data row88 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row88_col9\" class=\"data row88 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row88_col10\" class=\"data row88 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row88_col11\" class=\"data row88 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row89\" class=\"row_heading level0 row89\" >23</th>\n",
       "      <td id=\"T_8e976_row89_col0\" class=\"data row89 col0\" >23</td>\n",
       "      <td id=\"T_8e976_row89_col1\" class=\"data row89 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row89_col2\" class=\"data row89 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row89_col3\" class=\"data row89 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row89_col4\" class=\"data row89 col4\" >256-512-1024</td>\n",
       "      <td id=\"T_8e976_row89_col5\" class=\"data row89 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row89_col6\" class=\"data row89 col6\" >0.639443</td>\n",
       "      <td id=\"T_8e976_row89_col7\" class=\"data row89 col7\" >0.650089</td>\n",
       "      <td id=\"T_8e976_row89_col8\" class=\"data row89 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row89_col9\" class=\"data row89 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row89_col10\" class=\"data row89 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row89_col11\" class=\"data row89 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row90\" class=\"row_heading level0 row90\" >39</th>\n",
       "      <td id=\"T_8e976_row90_col0\" class=\"data row90 col0\" >7</td>\n",
       "      <td id=\"T_8e976_row90_col1\" class=\"data row90 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row90_col2\" class=\"data row90 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row90_col3\" class=\"data row90 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row90_col4\" class=\"data row90 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row90_col5\" class=\"data row90 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row90_col6\" class=\"data row90 col6\" >0.651099</td>\n",
       "      <td id=\"T_8e976_row90_col7\" class=\"data row90 col7\" >0.646536</td>\n",
       "      <td id=\"T_8e976_row90_col8\" class=\"data row90 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row90_col9\" class=\"data row90 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row90_col10\" class=\"data row90 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row90_col11\" class=\"data row90 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row91\" class=\"row_heading level0 row91\" >125</th>\n",
       "      <td id=\"T_8e976_row91_col0\" class=\"data row91 col0\" >93</td>\n",
       "      <td id=\"T_8e976_row91_col1\" class=\"data row91 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row91_col2\" class=\"data row91 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row91_col3\" class=\"data row91 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row91_col4\" class=\"data row91 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row91_col5\" class=\"data row91 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row91_col6\" class=\"data row91 col6\" >0.624359</td>\n",
       "      <td id=\"T_8e976_row91_col7\" class=\"data row91 col7\" >0.646536</td>\n",
       "      <td id=\"T_8e976_row91_col8\" class=\"data row91 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row91_col9\" class=\"data row91 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row91_col10\" class=\"data row91 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row91_col11\" class=\"data row91 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row92\" class=\"row_heading level0 row92\" >65</th>\n",
       "      <td id=\"T_8e976_row92_col0\" class=\"data row92 col0\" >33</td>\n",
       "      <td id=\"T_8e976_row92_col1\" class=\"data row92 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row92_col2\" class=\"data row92 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row92_col3\" class=\"data row92 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row92_col4\" class=\"data row92 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row92_col5\" class=\"data row92 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row92_col6\" class=\"data row92 col6\" >0.597288</td>\n",
       "      <td id=\"T_8e976_row92_col7\" class=\"data row92 col7\" >0.644760</td>\n",
       "      <td id=\"T_8e976_row92_col8\" class=\"data row92 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row92_col9\" class=\"data row92 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row92_col10\" class=\"data row92 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row92_col11\" class=\"data row92 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row93\" class=\"row_heading level0 row93\" >47</th>\n",
       "      <td id=\"T_8e976_row93_col0\" class=\"data row93 col0\" >15</td>\n",
       "      <td id=\"T_8e976_row93_col1\" class=\"data row93 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row93_col2\" class=\"data row93 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row93_col3\" class=\"data row93 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row93_col4\" class=\"data row93 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row93_col5\" class=\"data row93 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row93_col6\" class=\"data row93 col6\" >0.666151</td>\n",
       "      <td id=\"T_8e976_row93_col7\" class=\"data row93 col7\" >0.644760</td>\n",
       "      <td id=\"T_8e976_row93_col8\" class=\"data row93 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row93_col9\" class=\"data row93 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row93_col10\" class=\"data row93 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row93_col11\" class=\"data row93 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row94\" class=\"row_heading level0 row94\" >49</th>\n",
       "      <td id=\"T_8e976_row94_col0\" class=\"data row94 col0\" >17</td>\n",
       "      <td id=\"T_8e976_row94_col1\" class=\"data row94 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row94_col2\" class=\"data row94 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row94_col3\" class=\"data row94 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row94_col4\" class=\"data row94 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row94_col5\" class=\"data row94 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row94_col6\" class=\"data row94 col6\" >0.639839</td>\n",
       "      <td id=\"T_8e976_row94_col7\" class=\"data row94 col7\" >0.642984</td>\n",
       "      <td id=\"T_8e976_row94_col8\" class=\"data row94 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row94_col9\" class=\"data row94 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row94_col10\" class=\"data row94 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row94_col11\" class=\"data row94 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row95\" class=\"row_heading level0 row95\" >123</th>\n",
       "      <td id=\"T_8e976_row95_col0\" class=\"data row95 col0\" >91</td>\n",
       "      <td id=\"T_8e976_row95_col1\" class=\"data row95 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row95_col2\" class=\"data row95 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row95_col3\" class=\"data row95 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row95_col4\" class=\"data row95 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row95_col5\" class=\"data row95 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row95_col6\" class=\"data row95 col6\" >0.628552</td>\n",
       "      <td id=\"T_8e976_row95_col7\" class=\"data row95 col7\" >0.641208</td>\n",
       "      <td id=\"T_8e976_row95_col8\" class=\"data row95 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row95_col9\" class=\"data row95 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row95_col10\" class=\"data row95 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row95_col11\" class=\"data row95 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row96\" class=\"row_heading level0 row96\" >75</th>\n",
       "      <td id=\"T_8e976_row96_col0\" class=\"data row96 col0\" >43</td>\n",
       "      <td id=\"T_8e976_row96_col1\" class=\"data row96 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row96_col2\" class=\"data row96 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row96_col3\" class=\"data row96 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row96_col4\" class=\"data row96 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row96_col5\" class=\"data row96 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row96_col6\" class=\"data row96 col6\" >0.619922</td>\n",
       "      <td id=\"T_8e976_row96_col7\" class=\"data row96 col7\" >0.641208</td>\n",
       "      <td id=\"T_8e976_row96_col8\" class=\"data row96 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row96_col9\" class=\"data row96 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row96_col10\" class=\"data row96 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row96_col11\" class=\"data row96 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row97\" class=\"row_heading level0 row97\" >85</th>\n",
       "      <td id=\"T_8e976_row97_col0\" class=\"data row97 col0\" >53</td>\n",
       "      <td id=\"T_8e976_row97_col1\" class=\"data row97 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row97_col2\" class=\"data row97 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row97_col3\" class=\"data row97 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row97_col4\" class=\"data row97 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row97_col5\" class=\"data row97 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row97_col6\" class=\"data row97 col6\" >0.655388</td>\n",
       "      <td id=\"T_8e976_row97_col7\" class=\"data row97 col7\" >0.641208</td>\n",
       "      <td id=\"T_8e976_row97_col8\" class=\"data row97 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row97_col9\" class=\"data row97 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row97_col10\" class=\"data row97 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row97_col11\" class=\"data row97 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row98\" class=\"row_heading level0 row98\" >89</th>\n",
       "      <td id=\"T_8e976_row98_col0\" class=\"data row98 col0\" >57</td>\n",
       "      <td id=\"T_8e976_row98_col1\" class=\"data row98 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row98_col2\" class=\"data row98 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row98_col3\" class=\"data row98 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row98_col4\" class=\"data row98 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row98_col5\" class=\"data row98 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row98_col6\" class=\"data row98 col6\" >0.635567</td>\n",
       "      <td id=\"T_8e976_row98_col7\" class=\"data row98 col7\" >0.637655</td>\n",
       "      <td id=\"T_8e976_row98_col8\" class=\"data row98 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row98_col9\" class=\"data row98 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row98_col10\" class=\"data row98 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row98_col11\" class=\"data row98 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row99\" class=\"row_heading level0 row99\" >93</th>\n",
       "      <td id=\"T_8e976_row99_col0\" class=\"data row99 col0\" >61</td>\n",
       "      <td id=\"T_8e976_row99_col1\" class=\"data row99 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row99_col2\" class=\"data row99 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row99_col3\" class=\"data row99 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row99_col4\" class=\"data row99 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row99_col5\" class=\"data row99 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row99_col6\" class=\"data row99 col6\" >0.658716</td>\n",
       "      <td id=\"T_8e976_row99_col7\" class=\"data row99 col7\" >0.635879</td>\n",
       "      <td id=\"T_8e976_row99_col8\" class=\"data row99 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row99_col9\" class=\"data row99 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row99_col10\" class=\"data row99 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row99_col11\" class=\"data row99 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row100\" class=\"row_heading level0 row100\" >71</th>\n",
       "      <td id=\"T_8e976_row100_col0\" class=\"data row100 col0\" >39</td>\n",
       "      <td id=\"T_8e976_row100_col1\" class=\"data row100 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row100_col2\" class=\"data row100 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row100_col3\" class=\"data row100 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row100_col4\" class=\"data row100 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row100_col5\" class=\"data row100 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row100_col6\" class=\"data row100 col6\" >0.646253</td>\n",
       "      <td id=\"T_8e976_row100_col7\" class=\"data row100 col7\" >0.634103</td>\n",
       "      <td id=\"T_8e976_row100_col8\" class=\"data row100 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row100_col9\" class=\"data row100 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row100_col10\" class=\"data row100 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row100_col11\" class=\"data row100 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row101\" class=\"row_heading level0 row101\" >67</th>\n",
       "      <td id=\"T_8e976_row101_col0\" class=\"data row101 col0\" >35</td>\n",
       "      <td id=\"T_8e976_row101_col1\" class=\"data row101 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row101_col2\" class=\"data row101 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row101_col3\" class=\"data row101 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row101_col4\" class=\"data row101 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row101_col5\" class=\"data row101 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row101_col6\" class=\"data row101 col6\" >0.667418</td>\n",
       "      <td id=\"T_8e976_row101_col7\" class=\"data row101 col7\" >0.632327</td>\n",
       "      <td id=\"T_8e976_row101_col8\" class=\"data row101 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row101_col9\" class=\"data row101 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row101_col10\" class=\"data row101 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row101_col11\" class=\"data row101 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row102\" class=\"row_heading level0 row102\" >81</th>\n",
       "      <td id=\"T_8e976_row102_col0\" class=\"data row102 col0\" >49</td>\n",
       "      <td id=\"T_8e976_row102_col1\" class=\"data row102 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row102_col2\" class=\"data row102 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row102_col3\" class=\"data row102 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row102_col4\" class=\"data row102 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row102_col5\" class=\"data row102 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row102_col6\" class=\"data row102 col6\" >0.664191</td>\n",
       "      <td id=\"T_8e976_row102_col7\" class=\"data row102 col7\" >0.628774</td>\n",
       "      <td id=\"T_8e976_row102_col8\" class=\"data row102 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row102_col9\" class=\"data row102 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row102_col10\" class=\"data row102 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row102_col11\" class=\"data row102 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row103\" class=\"row_heading level0 row103\" >119</th>\n",
       "      <td id=\"T_8e976_row103_col0\" class=\"data row103 col0\" >87</td>\n",
       "      <td id=\"T_8e976_row103_col1\" class=\"data row103 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row103_col2\" class=\"data row103 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row103_col3\" class=\"data row103 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row103_col4\" class=\"data row103 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row103_col5\" class=\"data row103 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row103_col6\" class=\"data row103 col6\" >0.665687</td>\n",
       "      <td id=\"T_8e976_row103_col7\" class=\"data row103 col7\" >0.628774</td>\n",
       "      <td id=\"T_8e976_row103_col8\" class=\"data row103 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row103_col9\" class=\"data row103 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row103_col10\" class=\"data row103 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row103_col11\" class=\"data row103 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row104\" class=\"row_heading level0 row104\" >113</th>\n",
       "      <td id=\"T_8e976_row104_col0\" class=\"data row104 col0\" >81</td>\n",
       "      <td id=\"T_8e976_row104_col1\" class=\"data row104 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row104_col2\" class=\"data row104 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row104_col3\" class=\"data row104 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row104_col4\" class=\"data row104 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row104_col5\" class=\"data row104 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row104_col6\" class=\"data row104 col6\" >0.651145</td>\n",
       "      <td id=\"T_8e976_row104_col7\" class=\"data row104 col7\" >0.628774</td>\n",
       "      <td id=\"T_8e976_row104_col8\" class=\"data row104 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row104_col9\" class=\"data row104 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row104_col10\" class=\"data row104 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row104_col11\" class=\"data row104 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row105\" class=\"row_heading level0 row105\" >53</th>\n",
       "      <td id=\"T_8e976_row105_col0\" class=\"data row105 col0\" >21</td>\n",
       "      <td id=\"T_8e976_row105_col1\" class=\"data row105 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row105_col2\" class=\"data row105 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row105_col3\" class=\"data row105 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row105_col4\" class=\"data row105 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row105_col5\" class=\"data row105 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row105_col6\" class=\"data row105 col6\" >0.672665</td>\n",
       "      <td id=\"T_8e976_row105_col7\" class=\"data row105 col7\" >0.628774</td>\n",
       "      <td id=\"T_8e976_row105_col8\" class=\"data row105 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row105_col9\" class=\"data row105 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row105_col10\" class=\"data row105 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row105_col11\" class=\"data row105 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row106\" class=\"row_heading level0 row106\" >25</th>\n",
       "      <td id=\"T_8e976_row106_col0\" class=\"data row106 col0\" >25</td>\n",
       "      <td id=\"T_8e976_row106_col1\" class=\"data row106 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row106_col2\" class=\"data row106 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row106_col3\" class=\"data row106 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row106_col4\" class=\"data row106 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row106_col5\" class=\"data row106 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row106_col6\" class=\"data row106 col6\" >0.662720</td>\n",
       "      <td id=\"T_8e976_row106_col7\" class=\"data row106 col7\" >0.625222</td>\n",
       "      <td id=\"T_8e976_row106_col8\" class=\"data row106 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row106_col9\" class=\"data row106 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row106_col10\" class=\"data row106 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row106_col11\" class=\"data row106 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row107\" class=\"row_heading level0 row107\" >111</th>\n",
       "      <td id=\"T_8e976_row107_col0\" class=\"data row107 col0\" >79</td>\n",
       "      <td id=\"T_8e976_row107_col1\" class=\"data row107 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row107_col2\" class=\"data row107 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row107_col3\" class=\"data row107 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row107_col4\" class=\"data row107 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row107_col5\" class=\"data row107 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row107_col6\" class=\"data row107 col6\" >0.633410</td>\n",
       "      <td id=\"T_8e976_row107_col7\" class=\"data row107 col7\" >0.623446</td>\n",
       "      <td id=\"T_8e976_row107_col8\" class=\"data row107 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row107_col9\" class=\"data row107 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row107_col10\" class=\"data row107 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row107_col11\" class=\"data row107 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row108\" class=\"row_heading level0 row108\" >43</th>\n",
       "      <td id=\"T_8e976_row108_col0\" class=\"data row108 col0\" >11</td>\n",
       "      <td id=\"T_8e976_row108_col1\" class=\"data row108 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row108_col2\" class=\"data row108 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row108_col3\" class=\"data row108 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row108_col4\" class=\"data row108 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row108_col5\" class=\"data row108 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row108_col6\" class=\"data row108 col6\" >0.635329</td>\n",
       "      <td id=\"T_8e976_row108_col7\" class=\"data row108 col7\" >0.621670</td>\n",
       "      <td id=\"T_8e976_row108_col8\" class=\"data row108 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row108_col9\" class=\"data row108 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row108_col10\" class=\"data row108 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row108_col11\" class=\"data row108 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row109\" class=\"row_heading level0 row109\" >99</th>\n",
       "      <td id=\"T_8e976_row109_col0\" class=\"data row109 col0\" >67</td>\n",
       "      <td id=\"T_8e976_row109_col1\" class=\"data row109 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row109_col2\" class=\"data row109 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row109_col3\" class=\"data row109 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row109_col4\" class=\"data row109 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row109_col5\" class=\"data row109 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row109_col6\" class=\"data row109 col6\" >0.628636</td>\n",
       "      <td id=\"T_8e976_row109_col7\" class=\"data row109 col7\" >0.616341</td>\n",
       "      <td id=\"T_8e976_row109_col8\" class=\"data row109 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row109_col9\" class=\"data row109 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row109_col10\" class=\"data row109 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row109_col11\" class=\"data row109 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row110\" class=\"row_heading level0 row110\" >41</th>\n",
       "      <td id=\"T_8e976_row110_col0\" class=\"data row110 col0\" >9</td>\n",
       "      <td id=\"T_8e976_row110_col1\" class=\"data row110 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row110_col2\" class=\"data row110 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row110_col3\" class=\"data row110 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row110_col4\" class=\"data row110 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row110_col5\" class=\"data row110 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row110_col6\" class=\"data row110 col6\" >0.639925</td>\n",
       "      <td id=\"T_8e976_row110_col7\" class=\"data row110 col7\" >0.616341</td>\n",
       "      <td id=\"T_8e976_row110_col8\" class=\"data row110 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row110_col9\" class=\"data row110 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row110_col10\" class=\"data row110 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row110_col11\" class=\"data row110 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row111\" class=\"row_heading level0 row111\" >109</th>\n",
       "      <td id=\"T_8e976_row111_col0\" class=\"data row111 col0\" >77</td>\n",
       "      <td id=\"T_8e976_row111_col1\" class=\"data row111 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row111_col2\" class=\"data row111 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row111_col3\" class=\"data row111 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row111_col4\" class=\"data row111 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row111_col5\" class=\"data row111 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row111_col6\" class=\"data row111 col6\" >0.651506</td>\n",
       "      <td id=\"T_8e976_row111_col7\" class=\"data row111 col7\" >0.614565</td>\n",
       "      <td id=\"T_8e976_row111_col8\" class=\"data row111 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row111_col9\" class=\"data row111 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row111_col10\" class=\"data row111 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row111_col11\" class=\"data row111 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row112\" class=\"row_heading level0 row112\" >1</th>\n",
       "      <td id=\"T_8e976_row112_col0\" class=\"data row112 col0\" >1</td>\n",
       "      <td id=\"T_8e976_row112_col1\" class=\"data row112 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row112_col2\" class=\"data row112 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row112_col3\" class=\"data row112 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row112_col4\" class=\"data row112 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row112_col5\" class=\"data row112 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row112_col6\" class=\"data row112 col6\" >0.665929</td>\n",
       "      <td id=\"T_8e976_row112_col7\" class=\"data row112 col7\" >0.612789</td>\n",
       "      <td id=\"T_8e976_row112_col8\" class=\"data row112 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row112_col9\" class=\"data row112 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row112_col10\" class=\"data row112 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row112_col11\" class=\"data row112 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row113\" class=\"row_heading level0 row113\" >105</th>\n",
       "      <td id=\"T_8e976_row113_col0\" class=\"data row113 col0\" >73</td>\n",
       "      <td id=\"T_8e976_row113_col1\" class=\"data row113 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row113_col2\" class=\"data row113 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row113_col3\" class=\"data row113 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row113_col4\" class=\"data row113 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row113_col5\" class=\"data row113 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row113_col6\" class=\"data row113 col6\" >0.658312</td>\n",
       "      <td id=\"T_8e976_row113_col7\" class=\"data row113 col7\" >0.605684</td>\n",
       "      <td id=\"T_8e976_row113_col8\" class=\"data row113 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row113_col9\" class=\"data row113 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row113_col10\" class=\"data row113 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row113_col11\" class=\"data row113 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row114\" class=\"row_heading level0 row114\" >37</th>\n",
       "      <td id=\"T_8e976_row114_col0\" class=\"data row114 col0\" >5</td>\n",
       "      <td id=\"T_8e976_row114_col1\" class=\"data row114 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row114_col2\" class=\"data row114 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row114_col3\" class=\"data row114 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row114_col4\" class=\"data row114 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row114_col5\" class=\"data row114 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row114_col6\" class=\"data row114 col6\" >0.652759</td>\n",
       "      <td id=\"T_8e976_row114_col7\" class=\"data row114 col7\" >0.605684</td>\n",
       "      <td id=\"T_8e976_row114_col8\" class=\"data row114 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row114_col9\" class=\"data row114 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row114_col10\" class=\"data row114 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row114_col11\" class=\"data row114 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row115\" class=\"row_heading level0 row115\" >73</th>\n",
       "      <td id=\"T_8e976_row115_col0\" class=\"data row115 col0\" >41</td>\n",
       "      <td id=\"T_8e976_row115_col1\" class=\"data row115 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row115_col2\" class=\"data row115 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row115_col3\" class=\"data row115 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row115_col4\" class=\"data row115 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row115_col5\" class=\"data row115 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row115_col6\" class=\"data row115 col6\" >0.659291</td>\n",
       "      <td id=\"T_8e976_row115_col7\" class=\"data row115 col7\" >0.598579</td>\n",
       "      <td id=\"T_8e976_row115_col8\" class=\"data row115 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row115_col9\" class=\"data row115 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row115_col10\" class=\"data row115 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row115_col11\" class=\"data row115 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row116\" class=\"row_heading level0 row116\" >33</th>\n",
       "      <td id=\"T_8e976_row116_col0\" class=\"data row116 col0\" >1</td>\n",
       "      <td id=\"T_8e976_row116_col1\" class=\"data row116 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row116_col2\" class=\"data row116 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row116_col3\" class=\"data row116 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row116_col4\" class=\"data row116 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row116_col5\" class=\"data row116 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row116_col6\" class=\"data row116 col6\" >0.648887</td>\n",
       "      <td id=\"T_8e976_row116_col7\" class=\"data row116 col7\" >0.596803</td>\n",
       "      <td id=\"T_8e976_row116_col8\" class=\"data row116 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row116_col9\" class=\"data row116 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row116_col10\" class=\"data row116 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row116_col11\" class=\"data row116 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row117\" class=\"row_heading level0 row117\" >17</th>\n",
       "      <td id=\"T_8e976_row117_col0\" class=\"data row117 col0\" >17</td>\n",
       "      <td id=\"T_8e976_row117_col1\" class=\"data row117 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row117_col2\" class=\"data row117 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row117_col3\" class=\"data row117 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row117_col4\" class=\"data row117 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row117_col5\" class=\"data row117 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row117_col6\" class=\"data row117 col6\" >0.710187</td>\n",
       "      <td id=\"T_8e976_row117_col7\" class=\"data row117 col7\" >0.589698</td>\n",
       "      <td id=\"T_8e976_row117_col8\" class=\"data row117 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row117_col9\" class=\"data row117 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row117_col10\" class=\"data row117 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row117_col11\" class=\"data row117 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row118\" class=\"row_heading level0 row118\" >77</th>\n",
       "      <td id=\"T_8e976_row118_col0\" class=\"data row118 col0\" >45</td>\n",
       "      <td id=\"T_8e976_row118_col1\" class=\"data row118 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row118_col2\" class=\"data row118 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row118_col3\" class=\"data row118 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row118_col4\" class=\"data row118 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row118_col5\" class=\"data row118 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row118_col6\" class=\"data row118 col6\" >0.648749</td>\n",
       "      <td id=\"T_8e976_row118_col7\" class=\"data row118 col7\" >0.589698</td>\n",
       "      <td id=\"T_8e976_row118_col8\" class=\"data row118 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row118_col9\" class=\"data row118 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row118_col10\" class=\"data row118 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row118_col11\" class=\"data row118 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row119\" class=\"row_heading level0 row119\" >29</th>\n",
       "      <td id=\"T_8e976_row119_col0\" class=\"data row119 col0\" >29</td>\n",
       "      <td id=\"T_8e976_row119_col1\" class=\"data row119 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row119_col2\" class=\"data row119 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row119_col3\" class=\"data row119 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row119_col4\" class=\"data row119 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row119_col5\" class=\"data row119 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row119_col6\" class=\"data row119 col6\" >0.719664</td>\n",
       "      <td id=\"T_8e976_row119_col7\" class=\"data row119 col7\" >0.582593</td>\n",
       "      <td id=\"T_8e976_row119_col8\" class=\"data row119 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row119_col9\" class=\"data row119 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row119_col10\" class=\"data row119 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row119_col11\" class=\"data row119 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row120\" class=\"row_heading level0 row120\" >13</th>\n",
       "      <td id=\"T_8e976_row120_col0\" class=\"data row120 col0\" >13</td>\n",
       "      <td id=\"T_8e976_row120_col1\" class=\"data row120 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row120_col2\" class=\"data row120 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row120_col3\" class=\"data row120 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row120_col4\" class=\"data row120 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row120_col5\" class=\"data row120 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row120_col6\" class=\"data row120 col6\" >0.778426</td>\n",
       "      <td id=\"T_8e976_row120_col7\" class=\"data row120 col7\" >0.555950</td>\n",
       "      <td id=\"T_8e976_row120_col8\" class=\"data row120 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row120_col9\" class=\"data row120 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row120_col10\" class=\"data row120 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row120_col11\" class=\"data row120 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row121\" class=\"row_heading level0 row121\" >61</th>\n",
       "      <td id=\"T_8e976_row121_col0\" class=\"data row121 col0\" >29</td>\n",
       "      <td id=\"T_8e976_row121_col1\" class=\"data row121 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row121_col2\" class=\"data row121 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row121_col3\" class=\"data row121 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row121_col4\" class=\"data row121 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row121_col5\" class=\"data row121 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row121_col6\" class=\"data row121 col6\" >0.689890</td>\n",
       "      <td id=\"T_8e976_row121_col7\" class=\"data row121 col7\" >0.552398</td>\n",
       "      <td id=\"T_8e976_row121_col8\" class=\"data row121 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row121_col9\" class=\"data row121 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row121_col10\" class=\"data row121 col10\" >6.000000</td>\n",
       "      <td id=\"T_8e976_row121_col11\" class=\"data row121 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row122\" class=\"row_heading level0 row122\" >57</th>\n",
       "      <td id=\"T_8e976_row122_col0\" class=\"data row122 col0\" >25</td>\n",
       "      <td id=\"T_8e976_row122_col1\" class=\"data row122 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row122_col2\" class=\"data row122 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row122_col3\" class=\"data row122 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row122_col4\" class=\"data row122 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row122_col5\" class=\"data row122 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row122_col6\" class=\"data row122 col6\" >0.690501</td>\n",
       "      <td id=\"T_8e976_row122_col7\" class=\"data row122 col7\" >0.539964</td>\n",
       "      <td id=\"T_8e976_row122_col8\" class=\"data row122 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row122_col9\" class=\"data row122 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row122_col10\" class=\"data row122 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row122_col11\" class=\"data row122 col11\" >GEGLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row123\" class=\"row_heading level0 row123\" >21</th>\n",
       "      <td id=\"T_8e976_row123_col0\" class=\"data row123 col0\" >21</td>\n",
       "      <td id=\"T_8e976_row123_col1\" class=\"data row123 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row123_col2\" class=\"data row123 col2\" >LeakyReLU</td>\n",
       "      <td id=\"T_8e976_row123_col3\" class=\"data row123 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row123_col4\" class=\"data row123 col4\" >32-64-128</td>\n",
       "      <td id=\"T_8e976_row123_col5\" class=\"data row123 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row123_col6\" class=\"data row123 col6\" >0.726256</td>\n",
       "      <td id=\"T_8e976_row123_col7\" class=\"data row123 col7\" >0.539964</td>\n",
       "      <td id=\"T_8e976_row123_col8\" class=\"data row123 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row123_col9\" class=\"data row123 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row123_col10\" class=\"data row123 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row123_col11\" class=\"data row123 col11\" >nan</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row124\" class=\"row_heading level0 row124\" >95</th>\n",
       "      <td id=\"T_8e976_row124_col0\" class=\"data row124 col0\" >63</td>\n",
       "      <td id=\"T_8e976_row124_col1\" class=\"data row124 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row124_col2\" class=\"data row124 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row124_col3\" class=\"data row124 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row124_col4\" class=\"data row124 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row124_col5\" class=\"data row124 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row124_col6\" class=\"data row124 col6\" >0.701837</td>\n",
       "      <td id=\"T_8e976_row124_col7\" class=\"data row124 col7\" >0.502664</td>\n",
       "      <td id=\"T_8e976_row124_col8\" class=\"data row124 col8\" >4.000000</td>\n",
       "      <td id=\"T_8e976_row124_col9\" class=\"data row124 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row124_col10\" class=\"data row124 col10\" >3.000000</td>\n",
       "      <td id=\"T_8e976_row124_col11\" class=\"data row124 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row125\" class=\"row_heading level0 row125\" >115</th>\n",
       "      <td id=\"T_8e976_row125_col0\" class=\"data row125 col0\" >83</td>\n",
       "      <td id=\"T_8e976_row125_col1\" class=\"data row125 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row125_col2\" class=\"data row125 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row125_col3\" class=\"data row125 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row125_col4\" class=\"data row125 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row125_col5\" class=\"data row125 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row125_col6\" class=\"data row125 col6\" >0.680208</td>\n",
       "      <td id=\"T_8e976_row125_col7\" class=\"data row125 col7\" >0.502664</td>\n",
       "      <td id=\"T_8e976_row125_col8\" class=\"data row125 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row125_col9\" class=\"data row125 col9\" >32.000000</td>\n",
       "      <td id=\"T_8e976_row125_col10\" class=\"data row125 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row125_col11\" class=\"data row125 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row126\" class=\"row_heading level0 row126\" >78</th>\n",
       "      <td id=\"T_8e976_row126_col0\" class=\"data row126 col0\" >46</td>\n",
       "      <td id=\"T_8e976_row126_col1\" class=\"data row126 col1\" >1-FTTransformerConfig</td>\n",
       "      <td id=\"T_8e976_row126_col2\" class=\"data row126 col2\" >nan</td>\n",
       "      <td id=\"T_8e976_row126_col3\" class=\"data row126 col3\" >0.000000</td>\n",
       "      <td id=\"T_8e976_row126_col4\" class=\"data row126 col4\" >nan</td>\n",
       "      <td id=\"T_8e976_row126_col5\" class=\"data row126 col5\" >Adam</td>\n",
       "      <td id=\"T_8e976_row126_col6\" class=\"data row126 col6\" >0.693390</td>\n",
       "      <td id=\"T_8e976_row126_col7\" class=\"data row126 col7\" >0.493783</td>\n",
       "      <td id=\"T_8e976_row126_col8\" class=\"data row126 col8\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row126_col9\" class=\"data row126 col9\" >64.000000</td>\n",
       "      <td id=\"T_8e976_row126_col10\" class=\"data row126 col10\" >8.000000</td>\n",
       "      <td id=\"T_8e976_row126_col11\" class=\"data row126 col11\" >LeakyReLU</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th id=\"T_8e976_level0_row127\" class=\"row_heading level0 row127\" >9</th>\n",
       "      <td id=\"T_8e976_row127_col0\" class=\"data row127 col0\" >9</td>\n",
       "      <td id=\"T_8e976_row127_col1\" class=\"data row127 col1\" >0-CategoryEmbeddingModelConfig</td>\n",
       "      <td id=\"T_8e976_row127_col2\" class=\"data row127 col2\" >ReLU</td>\n",
       "      <td id=\"T_8e976_row127_col3\" class=\"data row127 col3\" >0.200000</td>\n",
       "      <td id=\"T_8e976_row127_col4\" class=\"data row127 col4\" >128-64-32</td>\n",
       "      <td id=\"T_8e976_row127_col5\" class=\"data row127 col5\" >SGD</td>\n",
       "      <td id=\"T_8e976_row127_col6\" class=\"data row127 col6\" >0.781076</td>\n",
       "      <td id=\"T_8e976_row127_col7\" class=\"data row127 col7\" >0.433393</td>\n",
       "      <td id=\"T_8e976_row127_col8\" class=\"data row127 col8\" >nan</td>\n",
       "      <td id=\"T_8e976_row127_col9\" class=\"data row127 col9\" >nan</td>\n",
       "      <td id=\"T_8e976_row127_col10\" class=\"data row127 col10\" >nan</td>\n",
       "      <td id=\"T_8e976_row127_col11\" class=\"data row127 col11\" >nan</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n"
      ],
      "text/plain": [
       "<pandas.io.formats.style.Styler at 0x792b5c344fd0>"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tuner_df.trials_df.sort_values(\"accuracy\", ascending=False).style.background_gradient(\n",
    "    subset=[\"accuracy\"], cmap=\"RdYlGn\"\n",
    ").background_gradient(subset=[\"loss\"], cmap=\"RdYlGn_r\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "b4dc5aa1-9bd5-4634-9043-09cef9b89b24",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\">┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
       "┃<span style=\"font-weight: bold\">        Test metric        </span>┃<span style=\"font-weight: bold\">       DataLoader 0        </span>┃\n",
       "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
       "│<span style=\"color: #008080; text-decoration-color: #008080\">       test_accuracy       </span>│<span style=\"color: #800080; text-decoration-color: #800080\">    0.8173333406448364     </span>│\n",
       "│<span style=\"color: #008080; text-decoration-color: #008080\">         test_loss         </span>│<span style=\"color: #800080; text-decoration-color: #800080\">    0.38250666856765747    </span>│\n",
       "└───────────────────────────┴───────────────────────────┘\n",
       "</pre>\n"
      ],
      "text/plain": [
       "┏━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━┓\n",
       "┃\u001b[1m \u001b[0m\u001b[1m       Test metric       \u001b[0m\u001b[1m \u001b[0m┃\u001b[1m \u001b[0m\u001b[1m      DataLoader 0       \u001b[0m\u001b[1m \u001b[0m┃\n",
       "┡━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━┩\n",
       "│\u001b[36m \u001b[0m\u001b[36m      test_accuracy      \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   0.8173333406448364    \u001b[0m\u001b[35m \u001b[0m│\n",
       "│\u001b[36m \u001b[0m\u001b[36m        test_loss        \u001b[0m\u001b[36m \u001b[0m│\u001b[35m \u001b[0m\u001b[35m   0.38250666856765747   \u001b[0m\u001b[35m \u001b[0m│\n",
       "└───────────────────────────┴───────────────────────────┘\n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    },
    {
     "data": {
      "text/plain": [
       "[{'test_loss': 0.38250666856765747, 'test_accuracy': 0.8173333406448364}]"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tuner_df.best_model.evaluate(test)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "0ffd6ccf-8733-4dff-b13a-1b994158cf81",
   "metadata": {},
   "source": [
    "Once tuning finishes, the best model is available on the returned object (`tuner_df` here) as `best_model`. If you are happy with the result and want to reuse the model later, save it by calling `save_model`.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "id": "39835a62-f50a-45ba-8b0a-1d5e5c30188f",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<pre style=\"white-space:pre;overflow-x:auto;line-height:normal;font-family:Menlo,'DejaVu Sans Mono',consolas,'Courier New',monospace\"><span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">2024</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">07</span>-<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">20</span> <span style=\"color: #00ff00; text-decoration-color: #00ff00; font-weight: bold\">12:58:01</span>,<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">015</span> - <span style=\"font-weight: bold\">{</span>pytorch_tabular.tabular_model:<span style=\"color: #008080; text-decoration-color: #008080; font-weight: bold\">1572</span><span style=\"font-weight: bold\">}</span> - WARNING - Directory is not empty. Overwriting the \n",
       "contents.                                                                                                          \n",
       "</pre>\n"
      ],
      "text/plain": [
       "\u001b[1;36m2024\u001b[0m-\u001b[1;36m07\u001b[0m-\u001b[1;36m20\u001b[0m \u001b[1;92m12:58:01\u001b[0m,\u001b[1;36m015\u001b[0m - \u001b[1m{\u001b[0mpytorch_tabular.tabular_model:\u001b[1;36m1572\u001b[0m\u001b[1m}\u001b[0m - WARNING - Directory is not empty. Overwriting the \n",
       "contents.                                                                                                          \n"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "tuner_df.best_model.save_model(\"best_model\", inference_only=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "id": "1e6921e4-4cd9-4c82-8706-6731737da5f2",
   "metadata": {},
   "outputs": [],
   "source": [
    "# To reload the saved model later, uncomment the lines below\n",
    "# from pytorch_tabular import TabularModel\n",
    "# loaded_model = TabularModel.load_model(\"best_model\")\n",
    "# loaded_model.evaluate(test)"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
